MARTINGALE APPROACH TO STOCHASTIC DIFFERENTIAL
GAMES OF CONTROL AND STOPPING

By Ioannis Karatzas and Ingrid-Mona Zamfirescu 

Columbia University and Baruch College, CUNY 

We develop a martingale approach for studying continuous-time 
stochastic differential games of control and stopping, in a non-Markovian 
framework and with the control affecting only the drift term of the 
state-process. Under appropriate conditions, we show that the game 
has a value and construct a saddle pair of optimal control and stop- 
ping strategies. Crucial in this construction is a characterization of 
saddle pairs in terms of pathwise and martingale properties of suit- 
able quantities. 

1. Introduction and synopsis. We develop a theory for zero-sum stochastic
differential games with two players, a "controller" and a "stopper." The
state $X(\cdot)$ in these games evolves in Euclidean space according to a
stochastic functional/differential equation driven by a Wiener process; via
his choice of instantaneous, nonanticipative control $u(t)$, the controller
can affect the local drift of this state process $X(\cdot)$ at time $t$,
though not its local variance.

The stopper decides the duration of the game, in the form of a stopping
rule $\tau$ for the process $X(\cdot)$. At the terminal time $\tau$ the
stopper receives from the controller a "reward"
$\int_0^\tau h(t,X,u_t)\,dt + g(X(\tau))$ consisting of two parts: the
integral up to time $\tau$ of a time-dependent running reward $h$, which
also depends on the past and present states $X(s)$, $0 \le s \le t$, and on
the present value $u_t$ of the control; and the value at the terminal state
$X(\tau)$ of a continuous terminal reward function $g$ ("reward" always
refers to the stopper).

Under appropriate conditions on the local drift and local variance of the 
state process, and on the running and terminal cost functions h and g, 



Received December 2006; revised June 2007. 
Supported in part by the NSF Grant DMS-06-01774. 

AMS 2000 subject classifications. Primary 93E20, 60G40, 91A15; secondary 91A25, 
60G44. 

Key words and phrases. Stochastic games, control, optimal stopping, martingales, 
Doob-Meyer decompositions, stochastic maximum principle, thrifty control strategies. 

This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Probability. 
2008, Vol. 36, No. 4, 1495-1527. This reprint differs from the original in 
pagination and typographic detail. 



we establish the existence of a value for the resulting stochastic game of
control and stopping, as well as regularity and martingale-type properties
of the temporal evolution for the resulting value process. We also construct
optimal strategies for the two players, in the form of a saddle point
$(u^*,\tau^*)$, to wit: the strategy $u^*(\cdot)$ is the controller's best
response to the stopper's use of the stopping rule $\tau^*$, in the sense of
minimizing total expected cost; and the stopping rule $\tau^*$ is the
stopper's best response to the controller's use of the control strategy
$u^*(\cdot)$, in the sense of maximizing total expected reward.

The approach of the paper is direct and probabilistic. It draws on the 
Dubins-Savage (1965) theory, and builds on the martingale methodologies 
developed for the optimal stopping problem and for the problem of opti- 
mal stochastic control over the last three decades; see, for instance, Neveu 
(1975), El Karoui (1981), Benes (1970, 1971), Rishel (1970), Duncan and 
Varaiya (1971), Davis and Varaiya (1973), Davis (1973, 1979) and Elliott 
(1977, 1982). It proceeds in terms of a characterization of saddle points via
martingale-type properties of suitable quantities, which involve the value
process of the game.

An advantage of the approach is that it imposes no Markovian assumptions
on the dynamics of the state-process; it allows the local drift and variance
of the state-process, as well as the running cost, to depend at any given
time $t$ on past-and-present states $X(s)$, $0 \le s \le t$, in a fairly
general, measurable manner. (The boundedness and continuity assumptions can
most likely be relaxed.)

The main drawback of this approach is that it imposes a severe nonde- 
generacy condition on the local variance of the state-process, and does not 
allow this local variance to be influenced by the controller. We hope that 
subsequent work will be able to provide a more general theory for such 
stochastic games, possibly also for their nonzero-sum counterparts, without 
such restrictive assumptions — at least in the Markovian framework of, say, 
Fleming and Soner (2006), El Karoui, Nguyen and Jeanblanc-Picque (1987), 
Bensoussan and Lions (1982) or Bismut (1973, 1978). It would also be of 
considerable interest to provide a theory for control of "bounded variation" 
type (admixture of absolutely continuous, as in this paper, with pure jump 
and singular, terms). 

Extant work: A game between a controller and a stopper, in discrete time 
and with Polish (complete separable metric) state-space, was studied by 
Maitra and Sudderth (1996b); under appropriate conditions, these authors 
obtained the existence of a value for the game and provided a transfinite 
induction algorithm for its computation. 

In Karatzas and Sudderth (2001) a similar game was studied for a linear
diffusion process, with the unit interval as its state-space and absorption
at the endpoints. The one-dimensional nature of the setup allowed an
explicit computation of the value and of a saddle pair of strategies, based
on scale-function considerations and under a nondegeneracy condition on the
variance of the diffusion. Karatzas and Sudderth (2007) recently studied
nonzero-sum versions of these linear diffusion games, where one seeks and
constructs Nash equilibria, rather than saddle pairs, of strategies. Always in
a Markovian, one-dimensional framework, Weerasinghe (2006) was able to
solve in a similar, explicit manner a stochastic game with variance that is
allowed to degenerate, while Bayraktar and Young (2007) established a very
interesting convex-duality connection between a stochastic game of control
and stopping and a probability-of-ruin-minimization problem.

Along a parallel tack, stochastic games of stopping have been treated 
via the theory of Backwards Stochastic Differential Equations starting with 
Cvitanic and Karatzas (1996), and continuing with Hamadene and Lepeltier 
(1995, 2000) and Hamadene (2006) for games of mixed control/stopping. 

The methods used in the present paper are entirely different from those 
in all these works. 

• The cooperative version of the game has received far greater attention. In 
the standard model of stochastic control, treated, for instance, in the classic 
monograph Fleming and Soner (1992), the controller may influence the state 
dynamics but must operate over a prescribed time-horizon. If the controller 
is also allowed to choose a quitting time adaptively, at the expense of in- 
curring a termination cost, one has a problem of control with discretionary 
stopping [or "leavable" control problem, in the terminology of Dubins and 
Savage (1976)]. General existence/characterization results for such problems 
were obtained by Dubins and Savage (1976) and by Maitra and Sudderth 
(1996a) under the rubric of "leavable" stochastic control; by Krylov (1980), 
El Karoui (1981), Bensoussan and Lions (1982), Haussmann and Lepeltier 
(1990), Maitra and Sudderth (1996a), Morimoto (2003), Ceci and Basan 
(2004); and by Karatzas and Zamfirescu (2006) in the present framework. 
There are also several explicitly solvable problems in this vein: see Benes 
(1992), Davis and Zervos (1994), Karatzas and Sudderth (1999), Karatzas 
et al. (2000), Karatzas and Wang (2000, 2001), Kamizono and Morimoto 
(2002), Karatzas and Ocone (2002), Ocone and Weerasinghe (2006). 

Such problems arise, for instance, in target-tracking models, where one 
has to stay close to a target by spending fuel, declare when one has ar- 
rived "sufficiently close," then decide whether to engage the target or not. 
Combined stochastic control/optimal stopping problems also arise in math- 
ematical finance, namely, in the context of computing the upper-hedging 
prices of American contingent claims under constraints; these computations 
lead to stochastic control of the absolutely continuous or the singular type 
[e.g. Karatzas and Kou (1998), Karatzas and Wang (2000)].
The computation of the lower-hedging prices for American contingent
claims under constraints leads to stochastic games of control and stopping; 
see Karatzas and Kou (1998) for details. 

Synopsis: We set up in the next section the model for a controlled stochastic 
functional/differential equation driven by a Wiener process, that will be used 
throughout the paper; this setting is identical to that of Elliott (1982) and 
of our earlier paper Karatzas and Zamfirescu (2006). Within this model, 
we formulate in Section 3 the stochastic game of control and stopping that 
will be the focus of our study. Section 4 reviews in the present context 
the classical results for optimal stopping on the one hand, and for optimal 
stochastic control on the other, when these problems are viewed separately. 

Section 5 establishes the existence of a value for the stochastic game, 
and studies the regularity and some simple martingale-like properties of 
the resulting value process evolving through time. This study continues in 
earnest in Section 6 and culminates with Theorem 6.3. 

Section 7 then builds on these results, to provide necessary and sufficient
conditions for a pair $(u,\tau)$ consisting of a control strategy and a
stopping rule, to be a saddle point for the stochastic game. These conditions
are couched in terms of martingale-like properties for suitable quantities,
which involve the value process and the cumulative running reward. A similar
characterization is provided in Section 8 for the optimality of a given
control strategy $u(\cdot)$.

With the help of the predictable representation property of the Brownian
filtration under equivalent changes of probability measure, and of the
Doob-Meyer decomposition for sufficiently regular submartingales, this
characterization leads (in Section 9, and under appropriate conditions) to a
specific control strategy $u^*(\cdot)$ as a candidate for optimality. These
same martingale-type conditions suggest $\tau^*$, the first time the value
process $V(\cdot)$ of the game agrees with the terminal reward $g(X(\cdot))$,
as a candidate for optimal stopping rule. Finally, it is shown that the pair
$(u^*,\tau^*)$ is indeed a saddle point of the stochastic game, and that
$V(\cdot \wedge \tau^*)$ has continuous paths.

Notation. The paper is quite heavy with notation, so here is a partial 
list for ease of reference: 

$X(t)$, $W^u(t)$: Equations (1), (6) and equation (4), respectively.

$\Lambda^u(t)$, $\Lambda^u(t,\tau)$: Exponential likelihood ratios
(martingales); equations (3) and (25).

$Y^u(t,\tau)$, $Y^u(\tau)$: Total (i.e., terminal, plus running) cost/reward
on the interval $[[t,\tau]]$; equations (8), (23).

$\overline{V}$, $\underline{V}$ and $\overline{V}(t)$, $\underline{V}(t)$:
Upper and lower values of the game; equations (9), (11), (12).

$J(t,\tau)$: Minimal conditional expected total cost on the interval
$[[t,\tau]]$; equation (14).

$Z^u(t)$: Maximal conditional expected reward under control $u(\cdot)$, from
time $t$ onward; equation (19).

$Q^u(t)$: Cumulative maximal conditional expected reward under control
$u(\cdot)$, at time $t$; equation (20).

$R^u(t)$: Cumulative value of game under control $u(\cdot)$, at $t$;
equation (36).

$\tau_t^u(\varepsilon)$, $\tau_t^u$: Stopping rules; equation (22).

$\varrho_t(\varepsilon)$, $\varrho_t$: Stopping rules; equation (33).

$H(t,\omega,a,p)$: Hamiltonian function; equation (72).

Saddle Point: Inequalities (10).

Thrifty Control Strategy: Requirement (66).

2. The model. Consider the space $\Omega = C([0,T];\mathbb{R}^n)$ of
continuous functions $\omega:[0,T] \to \mathbb{R}^n$, defined on a given
bounded interval $[0,T]$ and taking values in some Euclidean space
$\mathbb{R}^n$. The coordinate mapping process will be denoted by
$W(t,\omega) = \omega(t)$, $0 \le t \le T$, and
$\mathcal{F}_t^W = \sigma(W(s);\ 0 \le s \le t)$, $0 \le t \le T$, will stand
for the natural filtration generated by this process $W$. The measurable
space $(\Omega, \mathcal{F}_T^W)$ will be endowed with Wiener measure
$\mathbb{P}$, under which $W$ becomes a standard $n$-dimensional Brownian
motion. We shall denote by $\mathbf{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$
the $\mathbb{P}$-augmentation of this natural filtration, and use the
notation $\|\omega\|_t^* := \max_{0 \le s \le t} |\omega(s)|$,
$\omega \in \Omega$, $0 \le t \le T$.

The $\sigma$-algebra of predictable subsets of the product space
$[0,T] \times \Omega$ will be denoted by $\mathcal{P}$, and $\mathcal{S}$
will stand for the collection of stopping rules of the filtration
$\mathbf{F}$. These are measurable mappings $\tau:\Omega \to [0,T]$ with the
property

    $\{\tau \le t\} \in \mathcal{F}_t, \quad \forall\, 0 \le t \le T.$

Given any two stopping rules $\rho$ and $\nu$ with $\rho \le \nu$, we shall
denote by $\mathcal{S}_{\rho,\nu}$ the collection of all stopping rules
$\tau \in \mathcal{S}$ with $\rho \le \tau \le \nu$.

Consider now a predictable (i.e., $\mathcal{P}$-measurable)
$\sigma:[0,T] \times \Omega \to L(\mathbb{R}^n;\mathbb{R}^n)$ with values in
the space $L(\mathbb{R}^n;\mathbb{R}^n)$ of $(n \times n)$ matrices, and
suppose that $\sigma(t,\omega)$ is nonsingular for every
$(t,\omega) \in [0,T] \times \Omega$ and that there exists some real constant
$K > 0$ for which

    $\|\sigma^{-1}(t,\omega)\| \le K$ and
    $|\sigma_{ij}(t,\omega) - \sigma_{ij}(t,\tilde\omega)| \le K \|\omega - \tilde\omega\|_t^*, \quad \forall\, 1 \le i,j \le n,$

hold for every $\omega \in \Omega$, $\tilde\omega \in \Omega$ and every
$t \in [0,T]$. Then for any initial condition $x \in \mathbb{R}^n$, there is
a pathwise unique, strong solution $X(\cdot)$ of the stochastic equation

(1)  $X(t) = x + \int_0^t \sigma(s,X)\,dW(s), \quad 0 \le t \le T;$

see Theorem 14.6 in Elliott (1982). In particular, the augmentation of the
natural filtration generated by $X(\cdot)$ coincides with the filtration
$\mathbf{F}$ itself.

Now let us introduce an element of control in this picture. We shall denote
by $\mathcal{U}$ the class of admissible control strategies
$u:[0,T] \times \Omega \to A$. These are predictable processes with values
in some given separable metric space $A$. We shall assume that $A$ is a
countable union of nonempty, compact subsets, and is endowed with the
$\sigma$-algebra $\mathcal{A}$ of its Borel subsets.

We shall consider also a $\mathcal{P} \otimes \mathcal{A}$-measurable
function $f:([0,T] \times \Omega) \times A \to \mathbb{R}^n$ with the
following properties:

• for each $a \in A$, the mapping $(t,\omega) \mapsto f(t,\omega,a)$ is
predictable; and

• there exists a real constant $K > 0$ such that

(2)  $|f(t,\omega,a)| \le K(1 + \|\omega\|_t^*), \quad \forall\, 0 \le t \le T,\ \omega \in \Omega,\ a \in A.$
For any given admissible control strategy $u(\cdot) \in \mathcal{U}$, the
exponential process

(3)  $\Lambda^u(t) := \exp\Big\{ \int_0^t \langle \sigma^{-1}(s,X) f(s,X,u_s),\, dW(s) \rangle - \frac{1}{2} \int_0^t |\sigma^{-1}(s,X) f(s,X,u_s)|^2\, ds \Big\},$

$0 \le t \le T$, is a martingale under all these assumptions; namely,
$\mathbb{E}(\Lambda^u(T)) = 1$ [see Benes (1971), as well as Karatzas and
Shreve (1991), pages 191 and 200 for this result]. Then the Girsanov theorem
(ibid., Section 3.5) guarantees that the process

(4)  $W^u(t) := W(t) - \int_0^t \sigma^{-1}(s,X) f(s,X,u_s)\,ds, \quad 0 \le t \le T$

is a Brownian motion with respect to the filtration $\mathbf{F}$, under the
new probability measure

(5)  $\mathbb{P}^u(B) := \mathbb{E}[\Lambda^u(T) \cdot 1_B], \quad B \in \mathcal{F}_T,$

which is equivalent to $\mathbb{P}$. It is now clear from the equations (1)
and (4) that

(6)  $X(t) = x + \int_0^t f(s,X,u_s)\,ds + \int_0^t \sigma(s,X)\,dW^u(s), \quad 0 \le t \le T,$

holds almost surely. This will be our model for a controlled stochastic
functional/differential equation, with the control appearing only in the
drift (bounded variation) term.
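
As a rough numerical illustration of the change of measure in (3)-(5), the
sketch below looks at the simplest scalar special case. That case is an
assumption made purely for illustration and is not part of the model above:
$n = 1$, $\sigma \equiv 1$ and a constant drift $f \equiv \theta$, so that the
likelihood ratio reduces to $\exp(\theta W(T) - \theta^2 T/2)$. The function
names and parameter values are hypothetical. The Monte Carlo averages should
be close to 1 (the martingale property of the likelihood ratio) and to
$\theta T$ (the mean of $W(T)$ under the new measure).

```python
import math
import random


def likelihood_ratio(w_T, theta, T):
    """Lambda(T) = exp(theta*W(T) - theta^2*T/2), constant-drift special case."""
    return math.exp(theta * w_T - 0.5 * theta * theta * T)


def girsanov_check(theta=0.5, T=1.0, n_paths=200_000, seed=0):
    """Estimate E[Lambda(T)] and E[Lambda(T) * W(T)] by sampling W(T) under P."""
    rng = random.Random(seed)
    sum_lam = 0.0
    sum_lam_w = 0.0
    for _ in range(n_paths):
        w_T = rng.gauss(0.0, math.sqrt(T))  # W(T) ~ N(0, T) under P
        lam = likelihood_ratio(w_T, theta, T)
        sum_lam += lam
        sum_lam_w += lam * w_T
    # The second average estimates the mean of W(T) under the tilted measure.
    return sum_lam / n_paths, sum_lam_w / n_paths


mean_lam, mean_w_tilted = girsanov_check()
print(mean_lam, mean_w_tilted)
```

The weighting by the likelihood ratio is what turns an expectation under
$\mathbb{P}$ into one under $\mathbb{P}^u$; no new paths need to be sampled.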

3. The stochastic game of control and stopping. In order to specify the
objective of our stochastic game of control and stopping, let us consider two
bounded, measurable functions
$h:[0,T] \times \Omega \times A \to \mathbb{R}$ and
$g:\mathbb{R}^n \to \mathbb{R}$. We shall assume that the running reward
function $h$ satisfies the measurability conditions imposed on the
drift-function $f$ above, except of course that (2) is now strengthened to
the boundedness requirement

(7)  $|h(t,\omega,a)| \le K, \quad \forall\, 0 \le t \le T,\ \omega \in \Omega,\ a \in A.$

To simplify the analysis, we shall assume that the terminal reward function
$g$ is continuous.

We shall study a stochastic game of control and stopping with two players:
the controller, who chooses an admissible control strategy $u(\cdot)$ in
$\mathcal{U}$; and the stopper, who decides the duration of the game by his
choice of stopping rule $\tau \in \mathcal{S}$. When the stopper declares the
game to be over, he receives from the controller the amount
$Y^u(\tau) \equiv Y^u(0,\tau)$, where

(8)  $Y^u(t,\tau) := g(X(\tau)) + \int_t^\tau h(s,X,u_s)\,ds$ for $\tau \in \mathcal{S}_{t,T}$, $t \in \mathcal{S}$.

It is thus in the best interest of the controller (resp., the stopper) to
make the amount $Y^u(\tau)$ as small (resp., as large) as possible, at least
on the average. We are thus led to a stochastic game, with

(9)  $\overline{V} := \inf_{u \in \mathcal{U}} \sup_{\tau \in \mathcal{S}} \mathbb{E}^u(Y^u(\tau)), \qquad \underline{V} := \sup_{\tau \in \mathcal{S}} \inf_{u \in \mathcal{U}} \mathbb{E}^u(Y^u(\tau))$

as its upper- and lower-values, respectively; clearly,
$\underline{V} \le \overline{V}$.

We shall say that the game has a value, if its upper- and lower-values
coincide, that is, $\underline{V} = \overline{V}$; in that case we shall
denote this common value simply by $V$.

A pair $(u^*,\tau^*) \in \mathcal{U} \times \mathcal{S}$ will be called
saddle point of the game, if

(10)  $\mathbb{E}^{u^*}(Y^{u^*}(\tau)) \le \mathbb{E}^{u^*}(Y^{u^*}(\tau^*)) \le \mathbb{E}^{u}(Y^{u}(\tau^*))$

holds for every $u(\cdot) \in \mathcal{U}$ and $\tau \in \mathcal{S}$. In
words, the strategy $u^*(\cdot)$ is the controller's best response to the
stopper's use of the rule $\tau^*$; and the rule $\tau^*$ is the stopper's
best response to the controller's use of the strategy $u^*(\cdot)$.

If such a saddle-point pair $(u^*,\tau^*)$ exists, then it is quite clear
that the game has a value. We shall characterize the saddle property in terms
of simple, pathwise and martingale properties of certain crucial quantities;
see Theorem 7.1. Then, in Sections 8 and 9, we shall use this characterization
in an effort to show that a saddle point indeed exists and to identify its
components.
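
To make the saddle inequalities (10) concrete, here is a minimal sketch of a
finite, discrete-time analogue. Everything in it is an illustrative
assumption rather than the construction of this paper: the state is a
binomial walk on the integers, the control $a$ only changes the up-move
probability `p_up(a)`, both players use Markov feedback rules, and the
functions `g`, `h`, `p_up` as well as the action set are hypothetical. The
value is computed by backward induction, the candidate rule $\tau^*$ is the
first time the value equals the terminal reward, and the two best-response
problems are then solved separately to check that they return the same
number.

```python
def solve_game(n_steps, g, h, p_up, actions):
    """Backward induction V_n = max(g, min_a [h(a) + E^a V_{n+1}])."""
    def cont(a, nxt, j):
        # one-step running reward plus expected continuation under action a
        p = p_up(a)
        return h(a) + p * nxt[j + 1] + (1.0 - p) * nxt[j]

    # node (n, j): time n, j up-moves so far, state x = 2*j - n
    G = [[g(2 * j - n) for j in range(n + 1)] for n in range(n_steps + 1)]
    V = [row[:] for row in G]
    U = [[None] * (n + 1) for n in range(n_steps)]  # controller's feedback rule
    for n in range(n_steps - 1, -1, -1):
        for j in range(n + 1):
            U[n][j] = min(actions, key=lambda a: cont(a, V[n + 1], j))
            V[n][j] = max(G[n][j], cont(U[n][j], V[n + 1], j))
    # candidate stopping region: nodes where the value equals the reward g
    stop = [[V[n][j] <= G[n][j] for j in range(n + 1)]
            for n in range(n_steps + 1)]
    return G, V, U, stop, cont


def stopper_best_response(G, U, cont, n_steps):
    """Optimal stopping value when the controller is fixed to the rule U."""
    W = [row[:] for row in G]
    for n in range(n_steps - 1, -1, -1):
        for j in range(n + 1):
            W[n][j] = max(G[n][j], cont(U[n][j], W[n + 1], j))
    return W[0][0]


def controller_best_response(G, stop, cont, actions, n_steps):
    """Minimal expected cost when the stopper is fixed to the region `stop`."""
    C = [row[:] for row in G]
    for n in range(n_steps - 1, -1, -1):
        for j in range(n + 1):
            if not stop[n][j]:
                C[n][j] = min(cont(a, C[n + 1], j) for a in actions)
    return C[0][0]


N = 6
ACTIONS = (-1, 0, 1)
G, V, U, stop, cont = solve_game(
    N,
    g=lambda x: min(abs(x), 2),      # hypothetical terminal reward
    h=lambda a: 0.05 * a * a,        # hypothetical running reward
    p_up=lambda a: 0.5 + 0.2 * a,    # control shifts the drift only
    actions=ACTIONS,
)
value = V[0][0]
stopper_value = stopper_best_response(G, U, cont, N)
controller_value = controller_best_response(G, stop, cont, ACTIONS, N)
print(value, stopper_value, controller_value)
```

Agreement of the three numbers is the discrete counterpart of (10): no
stopping rule does better than $\tau^*$ against $u^*$, and no control does
better than $u^*$ against $\tau^*$.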

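In this effort we shall need to consider, a little more generally than in (9),
the upper-value-process

(11)  $\overline{V}(t) := \operatorname{essinf}_{u \in \mathcal{U}} \operatorname{esssup}_{\tau \in \mathcal{S}_{t,T}} \mathbb{E}^u(Y^u(t,\tau) \mid \mathcal{F}_t)$

and the lower-value-process

(12)  $\underline{V}(t) := \operatorname{esssup}_{\tau \in \mathcal{S}_{t,T}} \operatorname{essinf}_{u \in \mathcal{U}} \mathbb{E}^u(Y^u(t,\tau) \mid \mathcal{F}_t)$

of the game, for each $t \in \mathcal{S}$. Clearly,
$\overline{V}(0) = \overline{V}$, $\underline{V}(0) = \underline{V}$, as well
as

(13)  $g(X(t)) \le \underline{V}(t) \le \overline{V}(t), \quad \forall\, t \in \mathcal{S}.$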

We shall see in Theorem 5.1 that this last inequality holds, in fact, as an
equality: the game has a value at all times.

4. Optimal control and stopping problems, viewed separately. Given
any stopping rule $t \in \mathcal{S}$, we introduce the minimal conditional
expected cost

(14)  $J(t,\tau) := \operatorname{essinf}_{v \in \mathcal{U}} \mathbb{E}^v(Y^v(t,\tau) \mid \mathcal{F}_t),$

that can be achieved by the controller over the stochastic interval

(15)  $[[t,\tau]] := \{(s,\omega) \in [0,T] \times \Omega : t(\omega) \le s \le \tau(\omega)\},$

for each stopping rule $\tau \in \mathcal{S}_{t,T}$. With the notation (14),
the lower value (12) of the game becomes

(16)  $\underline{V}(t) = \operatorname{esssup}_{\tau \in \mathcal{S}_{t,T}} J(t,\tau) \ge J(t,t) = g(X(t))$  a.s.

By analogy with the classical martingale approach to stochastic control
[developed by Rishel (1970), Duncan and Varaiya (1971), Davis and Varaiya
(1973), Davis (1973) and outlined in Davis (1979), El Karoui (1981)], for any
given admissible control strategy $u(\cdot) \in \mathcal{U}$ and any stopping
rules $t, \nu, \tau$ with $0 \le t \le \nu \le \tau \le T$, we have the
$\mathbb{P}^u$-submartingale property

(17)  $\mathbb{E}^u(\Psi^u(\nu,\tau) \mid \mathcal{F}_t) \ge \Psi^u(t,\tau)$  for  $\Psi^u(t,\tau) := J(t,\tau) + \int_0^t h(s,X,u_s)\,ds,$

or equivalently,

(18)  $\mathbb{E}^u\Big[ J(\nu,\tau) + \int_t^\nu h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big] \ge J(t,\tau)$  a.s.

A very readable account of this theory appears in Chapter 16, pages 222-241
of Elliott (1982).

4.1. A family of optimal stopping problems. For each admissible control
strategy $u(\cdot) \in \mathcal{U}$, we define the maximal conditional
expected reward

(19)  $Z^u(t) := \operatorname{esssup}_{\tau \in \mathcal{S}_{t,T}} \mathbb{E}^u(Y^u(t,\tau) \mid \mathcal{F}_t), \quad t \in \mathcal{S},$

that can be achieved by the stopper from time $t$ onward, as well as the
"cumulative" quantity

(20)  $Q^u(t) := Z^u(t) + \int_0^t h(s,X,u_s)\,ds = \operatorname{esssup}_{\tau \in \mathcal{S}_{t,T}} \mathbb{E}^u(Y^u(\tau) \mid \mathcal{F}_t);$

in particular,

(21)  $Z^u(t) \ge Y^u(t,t) = g(X(t)), \qquad \overline{V}(t) = \operatorname{essinf}_{u \in \mathcal{U}} Z^u(t).$

Let us introduce the stopping rules

(22)  $\tau_t^u(\varepsilon) := \inf\{ s \in [t,T] : g(X(s)) \ge Z^u(s) - \varepsilon \}, \qquad \tau_t^u := \tau_t^u(0)$

for each $t \in \mathcal{S}$, $0 \le \varepsilon < 1$. Then
$\tau_t^u(\varepsilon) \le \tau_t^u$.

From the classical martingale approach to the theory of optimal stopping
[e.g., El Karoui (1981) or Karatzas and Shreve (1998), Appendix D], the
following results are well known.

Proposition 4.1. The process $Q^u(\cdot) = \{Q^u(t),\ 0 \le t \le T\}$ is a
$\mathbb{P}^u$-supermartingale with paths that are RCLL (Right-Continuous,
with Limits from the Left); it dominates the continuous process $Y^u(\cdot)$
given as

(23)  $Y^u(t) \equiv Y^u(0,t) = g(X(t)) + \int_0^t h(s,X,u_s)\,ds, \quad 0 \le t \le T;$

and $Q^u(\cdot)$ is the smallest RCLL supermartingale which dominates
$Y^u(\cdot)$.

In other words, $Q^u(\cdot)$ of (20) is the Snell Envelope of the process
$Y^u(\cdot)$.

Proposition 4.2. For any stopping rules $t, \nu, \theta$ with
$t \le \nu \le \theta \le \tau_t^u$, we have the martingale property
$\mathbb{E}^u[Q^u(\theta) \mid \mathcal{F}_\nu] = Q^u(\nu)$ a.s.; in
particular, $Q^u(\cdot \wedge \tau_0^u)$ is a $\mathbb{P}^u$-martingale.
Furthermore, $Z^u(t) = \mathbb{E}^u[Y^u(t,\tau_t^u) \mid \mathcal{F}_t]$ holds
a.s.
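
The Snell envelope of Propositions 4.1 and 4.2 has a familiar discrete-time
counterpart, which the following minimal sketch computes by backward
induction. The setting is an assumption chosen for illustration only: a fixed
control is modelled by a fixed up-move probability `p_up` on a binomial walk,
the running reward `h` is a constant, and all function and parameter names
are hypothetical. The recursion $Q_n = \max(Y_n, \mathbb{E}[Q_{n+1} \mid
\mathcal{F}_n])$ yields the smallest supermartingale dominating $Y$, and $Q$
coincides with the continuation value at every node where $Q > Y$, which is
the discrete analogue of the martingale property up to the first time $Q$
meets $Y$.

```python
def snell_envelope(n_steps, g, h, p_up=0.5, x0=0.0, dx=1.0):
    """Return (Y, Q): the reward process and its Snell envelope on a binomial tree.

    Node (n, j) means time n with j up-moves, so the state is x0 + dx*(2*j - n).
    Y[n][j] = g(state) + h*n is the discrete analogue of (23) with constant h.
    """
    def state(n, j):
        return x0 + dx * (2 * j - n)

    Y = [[g(state(n, j)) + h * n for j in range(n + 1)]
         for n in range(n_steps + 1)]
    Q = [row[:] for row in Y]  # terminal condition: Q_N = Y_N
    for n in range(n_steps - 1, -1, -1):
        for j in range(n + 1):
            cont = p_up * Q[n + 1][j + 1] + (1.0 - p_up) * Q[n + 1][j]
            Q[n][j] = max(Y[n][j], cont)  # stop now, or continue
    return Y, Q


# Two-step symmetric walk with reward |x| and no running reward.
Y, Q = snell_envelope(2, g=abs, h=0.0)
print(Q[0][0])
```

In this example $|X_n|$ is a submartingale, so it is never optimal to stop
early and `Q[0][0]` equals the expected terminal reward.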

4.2. A preparatory lemma. For the proof of several results in this work,
we shall need the following observation; we list it separately, for ease of
reference.

Lemma 4.3. Suppose that $t, \theta$ are stopping rules with
$0 \le t \le \theta \le T$, and that $u(\cdot)$, $v(\cdot)$ are admissible
control strategies in $\mathcal{U}$.

(i) Assume that $u(\cdot) = v(\cdot)$ holds a.e. on the stochastic interval
$[[t,\theta]]$, in the notation of (15). Then, for any bounded and
$\mathcal{F}_\theta$-measurable random variable $\Xi$, we have

(24)  $\mathbb{E}^v[\Xi \mid \mathcal{F}_t] = \mathbb{E}^u[\Xi \mid \mathcal{F}_t]$  a.s.

In particular, with $t = 0$ this gives
$\mathbb{E}^v[\Xi] = \mathbb{E}^u[\Xi]$.

(ii) More generally, assume that $u(\cdot) = v(\cdot)$ holds a.e. on
$\{(s,\omega) : t(\omega) \le s \le \theta(\omega),\ \omega \in A\}$ for some
$A \in \mathcal{F}_t$. Then (24) holds a.e. on the event $A$.

The reasoning is simple: with the notation
$\Lambda^u(t,\theta) := \Lambda^u(\theta)/\Lambda^u(t)$ from (3), and using
the martingale property of $\Lambda^u(\cdot)$ under $\mathbb{P}$, we have
$\mathbb{E}[\Lambda^u(t,\theta) \mid \mathcal{F}_t] = 1$ a.s. In conjunction
with the Bayes rule for conditional expectations under equivalent probability
measures, this gives

(25)  $\mathbb{E}^u[\Xi \mid \mathcal{F}_t] = \dfrac{\Lambda^u(t) \cdot \mathbb{E}[\Lambda^u(t,\theta)\,\Xi \mid \mathcal{F}_t]}{\Lambda^u(t) \cdot \mathbb{E}[\Lambda^u(t,\theta) \mid \mathcal{F}_t]} = \dfrac{\mathbb{E}[\Lambda^u(t,\theta)\,\Xi \mid \mathcal{F}_t]}{\mathbb{E}[\Lambda^u(t,\theta) \mid \mathcal{F}_t]} = \mathbb{E}[\Lambda^v(t,\theta)\,\Xi \mid \mathcal{F}_t] = \cdots = \mathbb{E}^v[\Xi \mid \mathcal{F}_t]$  a.s.

The second claim is proved similarly.

4.3. Families directed downward. For any given control strategy
$v(\cdot) \in \mathcal{U}$ and stopping rules $t, \theta$ with
$0 \le t \le \theta \le T$, we shall denote by $\mathcal{V}_{[t,\theta]}$ the
set of admissible control strategies $u(\cdot)$ as in Lemma 4.3 (i.e., with
$u(\cdot) = v(\cdot)$ a.e. on the stochastic interval $[[t,\theta]]$).

We observe from (19), (8) and Lemma 4.3 that $Z^u(\theta)$ depends only on
the values that the admissible control strategy $u(\cdot)$ takes over the
stochastic interval
$]]\theta,T]] := \{(s,\omega) \in [0,T] \times \Omega : \theta(\omega) < s \le T\}$
(its values over the stochastic interval $[[0,\theta]]$ are irrelevant for
computing $Z^u(\theta)$). Thus, for any given admissible control strategy
$v(\cdot) \in \mathcal{U}$, we can write the upper value (11) of the game as

(26)  $\overline{V}(\theta) = \operatorname{essinf}_{u \in \mathcal{U}} Z^u(\theta) = \operatorname{essinf}_{u \in \mathcal{V}_{[0,\theta]}} Z^u(\theta)$  a.s.



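Lemma 4.4. The family of random variables
$\{Z^u(\theta)\}_{u \in \mathcal{V}_{[0,\theta]}}$ is directed downward: for
any two $u^1(\cdot) \in \mathcal{V}_{[0,\theta]}$ and
$u^2(\cdot) \in \mathcal{V}_{[0,\theta]}$, there exists an admissible control
strategy $\hat{u}(\cdot) \in \mathcal{V}_{[0,\theta]}$ such that we have a.s.

    $Z^{\hat{u}}(\theta) = Z^{u^1}(\theta) \wedge Z^{u^2}(\theta).$

Proof. Consider the event
$A := \{Z^{u^1}(\theta) \le Z^{u^2}(\theta)\} \in \mathcal{F}_\theta$, and
define an admissible control process $\hat{u}(\cdot) \in \mathcal{U}$ via
$\hat{u}(s,\omega) := v(s,\omega)$ for $0 \le s \le \theta(\omega)$,

(27)  $\hat{u}(s,\omega) := u^1(s,\omega) \cdot 1_A(\omega) + u^2(s,\omega) \cdot 1_{A^c}(\omega)$  for $\theta(\omega) < s \le T$.

Consider also the stopping rule
$\hat\tau := \tau_\theta^{u^1} \cdot 1_A + \tau_\theta^{u^2} \cdot 1_{A^c} \in \mathcal{S}_{\theta,T}$
[notation of (22)]. Then from Lemma 4.3(ii) we have

(28)  $Z^{\hat{u}}(\theta) = \mathbb{E}^{\hat{u}}[Y^{\hat{u}}(\theta,\tau_\theta^{\hat{u}}) \mid \mathcal{F}_\theta]$
      $= \mathbb{E}^{u^1}[Y^{u^1}(\theta,\tau_\theta^{\hat{u}}) \mid \mathcal{F}_\theta] \cdot 1_A + \mathbb{E}^{u^2}[Y^{u^2}(\theta,\tau_\theta^{\hat{u}}) \mid \mathcal{F}_\theta] \cdot 1_{A^c}$
      $\le Z^{u^1}(\theta) \cdot 1_A + Z^{u^2}(\theta) \cdot 1_{A^c}$
      $= \mathbb{E}^{u^1}[Y^{u^1}(\theta,\tau_\theta^{u^1}) \mid \mathcal{F}_\theta] \cdot 1_A + \mathbb{E}^{u^2}[Y^{u^2}(\theta,\tau_\theta^{u^2}) \mid \mathcal{F}_\theta] \cdot 1_{A^c}$
      $= \mathbb{E}^{\hat{u}}[Y^{\hat{u}}(\theta,\tau_\theta^{u^1}) \mid \mathcal{F}_\theta] \cdot 1_A + \mathbb{E}^{\hat{u}}[Y^{\hat{u}}(\theta,\tau_\theta^{u^2}) \mid \mathcal{F}_\theta] \cdot 1_{A^c}$
      $= \mathbb{E}^{\hat{u}}[Y^{\hat{u}}(\theta,\hat\tau) \mid \mathcal{F}_\theta] \le Z^{\hat{u}}(\theta),$

thus, also
$Z^{\hat{u}}(\theta) = Z^{u^1}(\theta) \cdot 1_A + Z^{u^2}(\theta) \cdot 1_{A^c} = Z^{u^1}(\theta) \wedge Z^{u^2}(\theta)$,
a.s. □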

Now we can appeal to basic properties of the essential infimum [e.g., Neveu
(1975), page 121], to obtain the following.

Lemma 4.5. For each $\theta \in \mathcal{S}$, there exists a sequence of
admissible control processes
$\{u^k(\cdot)\}_{k \in \mathbb{N}} \subset \mathcal{V}_{[0,\theta]}$, such
that the corresponding sequence of random variables
$\{Z^{u^k}(\theta)\}_{k \in \mathbb{N}}$ is decreasing, and the essential
infimum in (26) becomes a limit:

(29)  $\overline{V}(\theta) = \lim_{k \to \infty} \downarrow Z^{u^k}(\theta)$  a.s.

5. Existence and regularity of the game's value process. For any given
$\theta \in \mathcal{S}$, and with
$\{u^k(\cdot)\}_{k \in \mathbb{N}} \subset \mathcal{V}_{[0,\theta]}$ the
sequence of (29), let us look at the corresponding stopping rules

    $\tau_\theta^k := \inf\{ s \in [\theta,T] : Z^{u^k}(s) = g(X(s)) \}, \quad k \in \mathbb{N},$

via (22). Recall that we have
$Z^{u^k}(\cdot) \ge Z^{u^\ell}(\cdot) \ge g(X(\cdot))$ for any integers
$\ell \ge k$, thus, also $\tau_\theta^k \ge \tau_\theta^\ell \ge \theta$. In
other words, the resulting sequence $\{\tau_\theta^k\}_{k \in \mathbb{N}}$ is
decreasing, so the limit

(30)  $\tau_\theta^* := \lim_{k \to \infty} \downarrow \tau_\theta^k$

exists a.s. and defines a stopping rule in $\mathcal{S}_{\theta,T}$. The
values of the process $u^k(\cdot)$ on the stochastic interval $[[0,\theta]]$
are irrelevant for computing $Z^{u^k}(s)$, $s \ge \theta$ or, for that
matter, $\tau_\theta^k$. But clearly,

    $\tau_\theta^k = \inf\{ s \in [\tau_\theta^*,T] : Z^{u^k}(s) = g(X(s)) \}, \quad k \in \mathbb{N},$

holds a.s., so the values of $u^k(\cdot)$ on $[[0,\tau_\theta^*]]$ are
irrelevant for computing $\tau_\theta^k$, $k \in \mathbb{N}$.

Thus, there exists a sequence
$\{u^k(\cdot)\}_{k \in \mathbb{N}} \subset \mathcal{V}_{[0,\tau_\theta^*]}$ of
admissible control strategies, which agree with the given control strategy
$v(\cdot) \in \mathcal{U}$ on the stochastic interval $[[0,\tau_\theta^*]]$,
and for which (30) holds.

We are ready to state and prove our first result. 

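Theorem 5.1. The game has a value: for every $\theta \in \mathcal{S}$, we
have $\overline{V}(\theta) = \underline{V}(\theta)$, a.s. In particular,
$\overline{V} = \underline{V}$ in (9). A bit more generally, for every
$t \in \mathcal{S}$ and any $\theta \in \mathcal{S}_{t,T}$, we have, almost
surely,

(31)  $\operatorname{essinf}_{u \in \mathcal{U}} \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^u(Y^u(t,\tau) \mid \mathcal{F}_t) = \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \operatorname{essinf}_{u \in \mathcal{U}} \mathbb{E}^u(Y^u(t,\tau) \mid \mathcal{F}_t).$

Proof. From the preceding remarks, we get the a.s. comparisons

    $\overline{V}(\theta) \le \mathbb{E}^{u^k}[Y^{u^k}(\theta,\tau_\theta^k) \mid \mathcal{F}_\theta]$
    $= \mathbb{E}[\Lambda^{u^k}(\theta,\tau_\theta^k)\, Y^{u^k}(\theta,\tau_\theta^k) \mid \mathcal{F}_\theta]$
    $= \mathbb{E}\Big[ \Lambda^v(\theta,\tau_\theta^*)\, \Lambda^{u^k}(\tau_\theta^*,\tau_\theta^k) \Big( g(X(\tau_\theta^k)) + \int_\theta^{\tau_\theta^*} h(s,X,v_s)\,ds + \int_{\tau_\theta^*}^{\tau_\theta^k} h(s,X,u^k_s)\,ds \Big) \,\Big|\, \mathcal{F}_\theta \Big]$

for every $k \in \mathbb{N}$; recall the computation (25). Passing to the
limit as $k \to \infty$, and using (30), the boundedness of $\sigma^{-1}$,
$f$, $h$, and the dominated convergence theorem, we obtain the a.s.
comparisons

    $\overline{V}(\theta) \le \mathbb{E}[\Lambda^v(\theta,\tau_\theta^*)\, Y^v(\theta,\tau_\theta^*) \mid \mathcal{F}_\theta] = \mathbb{E}^v[Y^v(\theta,\tau_\theta^*) \mid \mathcal{F}_\theta].$

Because $v(\cdot)$ is arbitrary, we can take the infimum of the right-hand
side of this inequality over $v(\cdot) \in \mathcal{U}$, and conclude

    $\overline{V}(\theta) \le \operatorname{essinf}_{v \in \mathcal{U}} \mathbb{E}^v[Y^v(\theta,\tau_\theta^*) \mid \mathcal{F}_\theta]$
    $\le \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \operatorname{essinf}_{v \in \mathcal{U}} \mathbb{E}^v[Y^v(\theta,\tau) \mid \mathcal{F}_\theta] = \underline{V}(\theta).$

The reverse inequality $\overline{V}(\theta) \ge \underline{V}(\theta)$ is
obvious, so we obtain the first claim of the theorem, namely,
$\overline{V}(\theta) = \underline{V}(\theta)$ a.s.

• As for (31), let us observe that for every given
$u(\cdot) \in \mathcal{U}$ we have, on the strength of Proposition 4.2, the
a.s. comparisons

    $\operatorname{essinf}_{w \in \mathcal{U}} \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^w\Big( Y^w(\theta,\tau) + \int_t^\theta h(s,X,w_s)\,ds \,\Big|\, \mathcal{F}_t \Big)$
    $\le \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^u\Big( Y^u(\theta,\tau) + \int_t^\theta h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big)$
    $\le \mathbb{E}^u\Big( \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^u[Y^u(\theta,\tau) \mid \mathcal{F}_\theta] + \int_t^\theta h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big)$
    $= \mathbb{E}^u\Big( \mathbb{E}^u[Y^u(\theta,\tau_\theta^u) \mid \mathcal{F}_\theta] + \int_t^\theta h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big)$
    $= \mathbb{E}^u\Big( Y^u(\theta,\tau_\theta^u) + \int_t^\theta h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big)$
    $= \mathbb{E}^u[Y^u(t,\tau_\theta^u) \mid \mathcal{F}_t].$

Now repeat the previous argument: fix $v(\cdot) \in \mathcal{U}$, write this
inequality with $u(\cdot)$ replaced by
$u^k(\cdot) \in \mathcal{V}_{[0,\tau_\theta^*]}$ [the sequence of (29), (30)]
for every $k \in \mathbb{N}$, and observe that the last term in the above
string is now equal to
$\mathbb{E}^{u^k}[Y^{u^k}(t,\tau_\theta^k) \mid \mathcal{F}_t]$. Then pass to
the limit as $k \to \infty$ to get, a.s.,

    $\operatorname{essinf}_{w \in \mathcal{U}} \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^w\Big( Y^w(\theta,\tau) + \int_t^\theta h(s,X,w_s)\,ds \,\Big|\, \mathcal{F}_t \Big) \le \mathbb{E}^v[Y^v(t,\tau_\theta^*) \mid \mathcal{F}_t].$

The arbitrariness of $v(\cdot)$ allows us to take the (essential) infimum of
the right-hand side over $v(\cdot) \in \mathcal{U}$, and obtain

    $\operatorname{essinf}_{w \in \mathcal{U}} \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \mathbb{E}^w[Y^w(t,\tau) \mid \mathcal{F}_t] \le \operatorname{essinf}_{v \in \mathcal{U}} \mathbb{E}^v[Y^v(t,\tau_\theta^*) \mid \mathcal{F}_t]$
    $\le \operatorname{esssup}_{\tau \in \mathcal{S}_{\theta,T}} \operatorname{essinf}_{v \in \mathcal{U}} \mathbb{E}^v[Y^v(t,\tau) \mid \mathcal{F}_t],$

that is, the inequality ($\le$) of (31); once again, the reverse inequality is
obvious. □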

From now on we shall denote by
$V(\cdot) = \overline{V}(\cdot) = \underline{V}(\cdot)$ the common value
process of this game, and write $V = V(0)$.

Proposition 5.2. The value process $V(\cdot)$ is right-continuous.

Proof. The Snell Envelope $Q^u(\cdot)$ of (20) can be taken in its RCLL
modification, as we have already done; so the same is the case for the
process $Z^u(\cdot)$ of (19). Consequently, we obtain
$\limsup_{s \downarrow t} V(s) \le \lim_{s \downarrow t} Z^u(s) = Z^u(t)$,
a.s. Taking the infimum over $u(\cdot) \in \mathcal{U}$, we deduce
$\limsup_{s \downarrow t} V(s) \le V(t)$, a.s.

In order to show that the reverse inequality

(32)  $\liminf_{s \downarrow t} V(s) \ge V(t)$  a.s.,

also holds, recall the submartingale property of (17) and (18) and deduce
from it, and from Proposition 1.3.14 in Karatzas and Shreve (1991), that
the right-hand limits

    $J(t+,\tau) := \lim_{s \downarrow t} J(s,\tau)$ on $\{t < \tau\}$, \qquad $J(t+,\tau) := g(X(\tau))$ on $\{t = \tau\}$

exist and are finite, a.s. on the respective events. Now for any
$t \in [0,T]$ and every stopping rule $\tau \in \mathcal{S}_{t,T}$, recall
(16) to obtain

    $\liminf_{s \downarrow t} V(s) \ge \liminf_{s \downarrow t} J(s, s \vee \tau)$
    $= \liminf_{s \downarrow t} J(s,\tau) \cdot 1_{\{t < \tau\}} + \liminf_{s \downarrow t} J(s,s) \cdot 1_{\{t = \tau\}}.$

But on the event $\{t = \tau\}$, we have almost surely

    $\liminf_{s \downarrow t} J(s,s) = \liminf_{s \downarrow t} g(X(s)) = \lim_{s \downarrow t} g(X(s)) = g(X(t)) = J(t,t)$

by the continuity of $g(\cdot)$; whereas on the event $\{t < \tau\}$, we have
the a.s. equalities
$\liminf_{s \downarrow t} J(s,\tau) = \lim_{s \downarrow t} J(s,\tau) = J(t+,\tau)$.
Recalling (18), we obtain from the bounded convergence theorem the a.s.
comparisons

    $\liminf_{s \downarrow t} V(s) \ge \lim_{s \downarrow t} J(s,\tau) = \mathbb{E}^u\Big( \lim_{s \downarrow t} J(s,\tau) \,\Big|\, \mathcal{F}_{t+} \Big)$
    $= \mathbb{E}^u\Big( \lim_{s \downarrow t} J(s,\tau) \,\Big|\, \mathcal{F}_t \Big)$
    $= \mathbb{E}^u\Big( \lim_{s \downarrow t} \Big[ J(s,\tau) + \int_t^s h(r,X,u_r)\,dr \Big] \,\Big|\, \mathcal{F}_t \Big)$
    $= \lim_{s \downarrow t} \mathbb{E}^u\Big[ J(s,\tau) + \int_t^s h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_t \Big] \ge J(t,\tau).$

We have used here the right-continuity of the augmented Brownian filtration
[Karatzas and Shreve (1991), pages 89-92]. The stopping rule
$\tau \in \mathcal{S}_{t,T}$ is arbitrary in these comparisons; taking the
(essential) supremum over $\mathcal{S}_{t,T}$ and recalling (16), we arrive at
the desired inequality (32). □

5.1. Some elementary submartingales. By analogy with (22), let us introduce
now for each $t \in \mathcal{S}$ and $0 \le \varepsilon < 1$ the stopping
rules

(33)  $\varrho_t(\varepsilon) := \inf\{ s \in [t,T] : g(X(s)) \ge V(s) - \varepsilon \}, \qquad \varrho_t := \varrho_t(0).$

Since

(34)  $V(\cdot) = \operatorname{essinf}_{u \in \mathcal{U}} Z^u(\cdot) \ge g(X(\cdot))$

holds a.s. thanks to (26), we have also

(35)  $\varrho_t \vee \tau_t^u(\varepsilon) \le \tau_t^u, \qquad \varrho_t(\varepsilon) \le \tau_t^u(\varepsilon) \wedge \varrho_t.$

For each admissible control strategy $u(\cdot) \in \mathcal{U}$, let us
introduce the family of random variables

(36)  $R^u(t) := V(t) + \int_0^t h(s,X,u_s)\,ds \ge Y^u(t), \quad t \in \mathcal{S}.$

For any time $t \in \mathcal{S}$, the quantity $R^u(t)$ represents the
cumulative cost to the controller of using the strategy $u(\cdot)$ on
$[[0,t]]$, plus the game's value at that time.


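Proposition 5.3. For each $u(\cdot) \in \mathcal{U}$, the process
$R^u(\cdot \wedge \varrho_0)$ is a $\mathbb{P}^u$-submartingale. A bit more
generally, for any stopping rules $t, \vartheta$ with
$t \le \vartheta \le \varrho_t$, we have

(37)  $\mathbb{E}^u[R^u(\vartheta) \mid \mathcal{F}_t] \ge R^u(t)$  a.s.,

or, equivalently,

(38)  $\mathbb{E}^u\Big[ V(\vartheta) + \int_t^\vartheta h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t \Big] \ge V(t)$  a.s.

Furthermore, for any stopping rules $s, t, \vartheta$ with
$0 \le s \le t \le \vartheta \le \varrho_t$, we have almost surely

(39)  $\operatorname{essinf}_{u \in \mathcal{U}} \mathbb{E}^u\Big[ V(\vartheta) + \int_s^\vartheta h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s \Big] \ge \operatorname{essinf}_{u \in \mathcal{U}} \mathbb{E}^u\Big[ V(t) + \int_s^t h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s \Big].$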
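Proof. For any admissible control strategy $u(\cdot)\in\mathfrak{U}$, and for any stopping rules $t,\vartheta$ with $t\le\vartheta\le\varrho_t$, we have $\mathbb{E}^u[Q^u(\vartheta)\mid\mathcal{F}_t]=Q^u(t)$ or, equivalently,

$$
\mathbb{E}^u\Big[Z^u(\vartheta)+\int_t^{\vartheta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] = Z^u(t) \ge V(t) \quad\text{a.s.,} \tag{40}
$$

from (21), (35) and Propositions 4.1 and 4.2. Now fix a control strategy $v(\cdot)\in\mathfrak{U}$, and denote again by $\mathfrak{U}_{[t,\vartheta]}$ the set of admissible control strategies $u(\cdot)$ as in Lemma 4.3 that agree with it [i.e., satisfy $u(\cdot)=v(\cdot)$ a.e.] on the stochastic interval $[\![t,\vartheta]\!]$. From this result and (40), we obtain

$$
\mathbb{E}^v\Big[Z^u(\vartheta)+\int_t^{\vartheta} h(s,X,v_s)\,ds \,\Big|\, \mathcal{F}_t\Big] = Z^u(t) \ge V(t) \quad\text{a.s.} \tag{41}
$$

Now select some sequence $\{u^k(\cdot)\}_{k\in\mathbb{N}}\subset\mathfrak{U}_{[t,\vartheta]}$ as in (29) of Lemma 4.5, substitute $u^k(\cdot)$ for $u(\cdot)$ in (41), let $k\to\infty$, and appeal to the bounded convergence theorem for conditional expectations to obtain

$$
\mathbb{E}^v\Big[V(\vartheta)+\int_t^{\vartheta} h(s,X,v_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \ge V(t) \quad\text{a.s.}
$$

This gives (38), therefore, also

$$
\mathbb{E}^u\Big[V(\vartheta)+\int_s^{\vartheta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
\;\ge\;
\mathbb{E}^u\Big[V(t)+\int_s^{t} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
$$

for all $u(\cdot)\in\mathfrak{U}$. The claim (39) follows now by taking essential infima over $u(\cdot)\in\mathfrak{U}$ on both sides. □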
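The equivalence of (37) and (38) asserted in Proposition 5.3 is a one-line computation; the sketch below only uses the definition (36) and the fact that $\int_0^t h(s,X,u_s)\,ds$ is $\mathcal{F}_t$-measurable.

```latex
\begin{align*}
\mathbb{E}^u\big[R^u(\vartheta)\,\big|\,\mathcal{F}_t\big]-R^u(t)
 &= \mathbb{E}^u\Big[V(\vartheta)+\int_0^{\vartheta}h(s,X,u_s)\,ds\,\Big|\,\mathcal{F}_t\Big]
    -V(t)-\int_0^{t}h(s,X,u_s)\,ds\\
 &= \mathbb{E}^u\Big[V(\vartheta)+\int_t^{\vartheta}h(s,X,u_s)\,ds\,\Big|\,\mathcal{F}_t\Big]-V(t).
\end{align*}
```

Hence the left-hand side is nonnegative a.s., which is (37), exactly when (38) holds.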



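Proposition 5.4. For every $t\in\mathcal{S}$, we have

$$
V(t) = \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big(g(X(\varrho_t))+\int_t^{\varrho_t} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big) \quad\text{a.s.} \tag{42}
$$

As a consequence,

$$
V(t) = \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big(V(\varrho_t)+\int_t^{\varrho_t} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big) \quad\text{a.s.} \tag{43}
$$

and for any given $v(\cdot)\in\mathfrak{U}$, we get in the notation of (26):

$$
R^v(t) = \operatorname*{ess\,inf}_{u\in\mathfrak{U}_{[0,t]}} \mathbb{E}^u\big(R^u(\varrho_t)\mid\mathcal{F}_t\big) \quad\text{a.s.} \tag{44}
$$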
Proof. The definition (11) for the upper value of the game gives the inequality ($\ge$) in (42). For the reverse inequality ($\le$), write (38) of Proposition 5.3 with $\vartheta=\varrho_t$ and recall the a.s. equality $V(\varrho_t)=g(X(\varrho_t))$, a consequence of the definition of $\varrho_t$ in (33) and the right-continuity of $V(\cdot)$ from Proposition 5.2; the result is

$$
\begin{aligned}
V(t) &\le \mathbb{E}^u\Big(V(\varrho_t)+\int_t^{\varrho_t} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big)\\
&= \mathbb{E}^u\Big(g(X(\varrho_t))+\int_t^{\varrho_t} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big) \quad\text{a.s.}
\end{aligned}
$$

for every $u(\cdot)\in\mathfrak{U}$. Now (42) and (43) follow directly, and so does (44). □



Remark 5.1. Proposition 5.3 implies that the process $R^u(\cdot\wedge\varrho_0)$, which is right-continuous by virtue of Proposition 5.2, admits left-limits on $(0,T]$ almost surely; cf. Proposition 1.3.14 in Karatzas and Shreve (1991). Thus, the process $R^u(\cdot\wedge\varrho_0)$ is a $\mathbb{P}^u$-submartingale with RCLL paths, and the process $V(\cdot\wedge\varrho_0)$ has RCLL paths as well.
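The submartingale property of Proposition 5.3 has a simple discrete-time counterpart that can be checked numerically. The sketch below is a toy analogue only, not the continuous-time, non-Markovian model of this paper: the state is a $\pm1$ random walk, the controller's action `a` tilts the up-probability (a stand-in for the change of drift), and the value is computed by backward induction as "stop, or let the controller minimize the continuation". All names and parameter values (`solve_game`, `p_up`, `h`, `g`, `ACTIONS`) are hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Layer = Dict[int, float]  # node position x -> value at one time step


def continuation(V: List[Layer], n: int, x: int, a: int,
                 p_up: Callable[[int], float], h: Callable[[int], float]) -> float:
    """Running reward plus expected next value under action a: h(a) + E^a[V_{n+1}]."""
    p = p_up(a)
    return h(a) + p * V[n + 1][x + 1] + (1.0 - p) * V[n + 1][x - 1]


def solve_game(N: int, actions: Sequence[int], p_up: Callable[[int], float],
               h: Callable[[int], float], g: Callable[[int], float]
               ) -> Tuple[List[Layer], List[Dict[int, int]]]:
    """Backward induction: V_n(x) = max(g(x), min_a continuation)."""
    V: List[Layer] = [dict() for _ in range(N + 1)]
    a_star: List[Dict[int, int]] = [dict() for _ in range(N)]
    for x in range(-N, N + 1, 2):          # terminal layer: the stopper must stop
        V[N][x] = g(x)
    for n in range(N - 1, -1, -1):
        for x in range(-n, n + 1, 2):      # nodes reachable at time n
            best = min(actions, key=lambda a: continuation(V, n, x, a, p_up, h))
            a_star[n][x] = best            # minimizing ("thrifty") action
            V[n][x] = max(g(x), continuation(V, n, x, best, p_up, h))
    return V, a_star


def check_drifts(V: List[Layer], a_star: List[Dict[int, int]], N: int,
                 actions: Sequence[int], p_up: Callable[[int], float],
                 h: Callable[[int], float], g: Callable[[int], float]
                 ) -> Tuple[float, float, int]:
    """On nodes where V > g (before the analogue of rho_0), return
    (smallest one-step drift of R over all actions,
     largest |drift| under the minimizing action,
     number of such nodes)."""
    min_drift, max_abs_opt, n_cont = float("inf"), 0.0, 0
    for n in range(N):
        for x, v in V[n].items():
            if v > g(x):
                n_cont += 1
                for a in actions:
                    min_drift = min(min_drift, continuation(V, n, x, a, p_up, h) - v)
                opt = continuation(V, n, x, a_star[n][x], p_up, h) - v
                max_abs_opt = max(max_abs_opt, abs(opt))
    return min_drift, max_abs_opt, n_cont


# Toy data (assumptions, chosen only for illustration).
ACTIONS = (-1, 0, 1)


def p_up(a: int) -> float:
    return 0.5 + 0.2 * a                   # action tilts the walk up or down


def h(a: int) -> float:
    return 0.05 * a * a                    # running reward paid to the stopper


def g(x: int) -> float:
    return float(min(max(x, 0), 3))        # bounded terminal reward


if __name__ == "__main__":
    V, a_star = solve_game(6, ACTIONS, p_up, h, g)
    print(check_drifts(V, a_star, 6, ACTIONS, p_up, h, g))
```

In this toy model the one-step drift of "value plus accumulated running reward" is nonnegative under every action before the value first meets the terminal reward, and vanishes under the minimizing action; this mirrors, in a much simpler setting, the submartingale property above and the thriftiness condition studied in Section 8.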

6. Some properties of the value process. We shall derive in this section 
some further properties of V(-), the value process of the stochastic game. 
These will be crucial in characterizing, then constructing, a saddle point for 
the game in Sections 7 and 9, respectively. 

Our first such result provides inequalities in the reverse direction of those 
in (37) and (38), but for more general stopping rules and with appropriate 
modifications. 



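Proposition 6.1. For any stopping rules $t,\theta$ with $0\le t\le\theta\le T$, and any admissible control process $u(\cdot)\in\mathfrak{U}$, we have

$$
\mathbb{E}^u[R^u(\theta)\mid\mathcal{F}_t] \le \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{t,T}} \mathbb{E}^u\big(Y^u(\tau)\mid\mathcal{F}_t\big) \tag{45}
$$

and

$$
\mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \le \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{t,T}} \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big) = Z^u(t) \tag{46}
$$

almost surely. We also have

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \le V(t) \quad\text{a.s.} \tag{47}
$$

and

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}_{[0,t]}} \mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \le V(t) \quad\text{a.s.} \tag{48}
$$

for any given $v(\cdot)\in\mathfrak{U}$ in the notation used in (26).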
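Proof. We recall from (26), (19) and Theorem 5.1 that $V(\theta)=\operatorname*{ess\,inf}_{u\in\mathfrak{U}} Z^u(\theta)$; and from Proposition 4.1 that, for any given $u(\cdot)\in\mathfrak{U}$, the process $Q^u(\cdot)=Z^u(\cdot)+\int_0^{\cdot} h(s,X,u_s)\,ds$ is a $\mathbb{P}^u$-supermartingale. We have, therefore,

$$
\begin{aligned}
\mathbb{E}^u[R^u(\theta)\mid\mathcal{F}_t]
&= \mathbb{E}^u\Big[V(\theta)+\int_0^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big]\\
&\le \mathbb{E}^u\Big[Z^u(\theta)+\int_0^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big]\\
&\le Z^u(t)+\int_0^{t} h(s,X,u_s)\,ds\\
&= \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{t,T}} \mathbb{E}^u\big(Y^u(\tau)\mid\mathcal{F}_t\big),
\end{aligned} \tag{49}
$$

which is (45). Now (46) is a direct consequence; and (47) and (48) follow by taking essential infima over $u(\cdot)$ in $\mathfrak{U}$ and in $\mathfrak{U}_{[0,t]}$, respectively. □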

We have also the following result, which supplements the "value identity" of equation (31). In this equation the common value is at most $V(t)$, as we are taking the supremum over a class of stopping rules, $\mathcal{S}_{\theta,T}$, which is smaller than the class $\mathcal{S}_{t,T}$ appearing in (11) and (12). The next result tells us exactly how much smaller than $V(t)$ this common value is: it is given by the left-hand side of (47).

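Proposition 6.2. For any stopping rules $t,\theta$ with $0\le t\le\theta\le T$, we have almost surely

$$
\begin{aligned}
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big]
&= \operatorname*{ess\,inf}_{u\in\mathfrak{U}}\ \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big)\\
&= \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}}\ \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big).
\end{aligned} \tag{50}
$$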



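Proof. The second equality is, of course, that of (31). For the first, note that Proposition 5.4 gives $V(\theta)\le\mathbb{E}^u\big(g(X(\varrho_\theta))+\int_\theta^{\varrho_\theta} h(s,X,u_s)\,ds \mid \mathcal{F}_\theta\big)$ a.s., for every admissible control strategy $u(\cdot)\in\mathfrak{U}$, thus, also

$$
\begin{aligned}
\mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big]
&\le \mathbb{E}^u\Big(g(X(\varrho_\theta))+\int_t^{\varrho_\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big)\\
&= \mathbb{E}^u\big(Y^u(t,\varrho_\theta)\mid\mathcal{F}_t\big)\\
&\le \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big) \quad\text{a.s.}
\end{aligned} \tag{51}
$$

Taking essential infima on both sides over $u(\cdot)\in\mathfrak{U}$, we arrive at the inequality ($\le$) in (50).

For the reverse inequality, note from the definition (19) that

$$
Z^u(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \ge \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_\theta\big)
$$

holds a.s. for every $u(\cdot)\in\mathfrak{U}$ and every $\tau\in\mathcal{S}_{\theta,T}$ [in fact, with equality for the stopping rule $\tau=\tau^u_\theta$ of (22)]. Taking conditional expectations with respect to $\mathcal{F}_t$ on both sides, we obtain

$$
\mathbb{E}^u\Big(Z^u(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big) \ge \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big) \tag{52}
$$

for all $\tau\in\mathcal{S}_{\theta,T}$, again with equality for $\tau=\tau^u_\theta$; therefore,

$$
\mathbb{E}^u\Big(Z^u(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big) = \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^u\big(Y^u(t,\tau)\mid\mathcal{F}_t\big) \quad\text{a.s.} \tag{53}
$$

Fix now an admissible control strategy $v(\cdot)\in\mathfrak{U}$, and consider a sequence $\{u^k(\cdot)\}_{k\in\mathbb{N}}\subset\mathfrak{U}_{[t,\theta]}$ such that $V(\theta)=\lim_{k\to\infty}\downarrow Z^{u^k}(\theta)$ a.s., in the manner of (29) in Lemma 4.5. Write (53) with $u^k(\cdot)$ in place of $u(\cdot)$ and recall property (24) of Lemma 4.3 to obtain

$$
\begin{aligned}
\mathbb{E}^v\Big(Z^{u^k}(\theta)+\int_t^{\theta} h(s,X,v_s)\,ds \,\Big|\, \mathcal{F}_t\Big)
&= \mathbb{E}^{u^k}\Big(Z^{u^k}(\theta)+\int_t^{\theta} h(s,X,u^k_s)\,ds \,\Big|\, \mathcal{F}_t\Big)\\
&= \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^{u^k}\big(Y^{u^k}(t,\tau)\mid\mathcal{F}_t\big)\\
&\ge \operatorname*{ess\,inf}_{u\in\mathfrak{U}}\ \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^{u}\big(Y^{u}(t,\tau)\mid\mathcal{F}_t\big) \quad\text{a.s.}
\end{aligned}
$$

for every $k\in\mathbb{N}$. Now let $k\to\infty$ and use the bounded convergence theorem, to obtain

$$
\mathbb{E}^v\Big(V(\theta)+\int_t^{\theta} h(s,X,v_s)\,ds \,\Big|\, \mathcal{F}_t\Big) \ge \operatorname*{ess\,inf}_{u\in\mathfrak{U}}\ \operatorname*{ess\,sup}_{\tau\in\mathcal{S}_{\theta,T}} \mathbb{E}^{u}\big(Y^{u}(t,\tau)\mid\mathcal{F}_t\big) \quad\text{a.s.}
$$

Since $v(\cdot)\in\mathfrak{U}$ is an arbitrary control strategy, all that remains at this point is to take the essential infimum of the left-hand side with respect to $v(\cdot)\in\mathfrak{U}$, and we are done. □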
We are ready for the main result of this section. It says that $\inf_{u\in\mathfrak{U}}\mathbb{E}^u(R^u(\cdot))$, the best that the controller can achieve in terms of minimizing expected "running cost plus current value," does not increase with time; at best, this quantity is "flat up to $\varrho_0$," the first time the game's value equals the reward obtained by terminating the game.



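Theorem 6.3. For any stopping rules $t,\theta$ with $0\le t\le\theta\le T$, we have

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\theta)\mid\mathcal{F}_t\big) \le R^v(t) \quad\text{a.s.} \tag{54}
$$

for any $v(\cdot)\in\mathfrak{U}$, as well as

$$
\inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\theta)\big) \le \inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(t)\big) \le V(0). \tag{55}
$$

The first (resp., the second) of the inequalities in (55) is valid as equality if $\theta\le\varrho_t$ (resp., $t\le\varrho_0$) also holds.

A bit more generally, for any stopping rules $s,t,\theta$ with $0\le s\le t\le\theta\le T$, we have the a.s. comparisons

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_s^{\theta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
\le \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(t)+\int_s^{t} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
\le V(s). \tag{56}
$$

The first (resp., the second) of the inequalities in (56) is valid as an equality on the event $\{\theta\le\varrho_t\}$ (resp., $\{t\le\varrho_s\}$).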

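Proof. With $v(\cdot)\in\mathfrak{U}$ fixed, and with $\mathfrak{U}_{[0,t]}$ as in Lemma 4.5, we have

$$
\begin{aligned}
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\theta)\mid\mathcal{F}_t\big)
&= \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \Big\{\mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] + \int_0^{t} h(s,X,u_s)\,ds\Big\}\\
&\le \operatorname*{ess\,inf}_{u\in\mathfrak{U}_{[0,t]}} \mathbb{E}^u\Big[V(\theta)+\int_t^{\theta} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] + \int_0^{t} h(s,X,v_s)\,ds\\
&\le V(t)+\int_0^{t} h(s,X,v_s)\,ds = R^v(t) \quad\text{a.s.,}
\end{aligned}
$$

where the penultimate comparison comes from (48). This proves (54).

To obtain the first inequality in (56), observe that (49) gives

$$
\mathbb{E}^u\Big[V(\theta)+\int_s^{\theta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big] \le \mathbb{E}^u\Big[Z^u(t)+\int_s^{t} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big] \quad\text{a.s.}
$$

for all $u(\cdot)\in\mathfrak{U}$. Proceeding just as before, with $v(\cdot)\in\mathfrak{U}$ arbitrary but fixed, and with a sequence $\{u^k(\cdot)\}_{k\in\mathbb{N}}\subset\mathfrak{U}_{[0,t]}$ such that $V(t)=\lim_{k\to\infty}\downarrow Z^{u^k}(t)$ holds almost surely, as in Lemma 4.5, we have

$$
\begin{aligned}
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_s^{\theta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
&\le \mathbb{E}^{u^k}\Big[V(\theta)+\int_s^{\theta} h(r,X,u^k_r)\,dr \,\Big|\, \mathcal{F}_s\Big]\\
&\le \mathbb{E}^{u^k}\Big[Z^{u^k}(t)+\int_s^{t} h(r,X,u^k_r)\,dr \,\Big|\, \mathcal{F}_s\Big]\\
&= \mathbb{E}^{v}\Big[Z^{u^k}(t)+\int_s^{t} h(r,X,v_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
\end{aligned}
$$

for every $k\in\mathbb{N}$, thus, also

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_s^{\theta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big] \le \mathbb{E}^{v}\Big[V(t)+\int_s^{t} h(r,X,v_r)\,dr \,\Big|\, \mathcal{F}_s\Big]
$$

in the limit as $k\to\infty$. Take the essential infimum of the right-hand side over $v(\cdot)\in\mathfrak{U}$ to obtain the desired a.s. inequality

$$
\operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\theta)+\int_s^{\theta} h(r,X,u_r)\,dr \,\Big|\, \mathcal{F}_s\Big] \le \operatorname*{ess\,inf}_{v\in\mathfrak{U}} \mathbb{E}^{v}\Big[V(t)+\int_s^{t} h(r,X,v_r)\,dr \,\Big|\, \mathcal{F}_s\Big],
$$

the first in (56). [The reverse inequality holds on the event $\{\theta\le\varrho_t\}$, as we know from (39).] The second inequality of (56) follows from the first, upon replacing $\theta$ by $t$, and $t$ by $s$.

Now (55) follows directly from (56), just by taking $s=0$ there. □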



7. A martingale characterization of saddle-points. We are now in a po- 
sition to provide necessary and sufficient conditions for the saddle-point 
property (10), in terms of appropriate martingales. These conditions are of 
obvious independent interest; they will also prove crucial when we try, in 
the next two sections, to prove constructively the existence of a saddle point 
$(u^*,\tau^*)$ for the stochastic game of control and stopping.

Theorem 7.1. A pair $(u^*,\tau^*)\in\mathfrak{U}\times\mathcal{S}$ is a saddle point as in (10) for the stochastic game of control and stopping, if and only if the following three conditions hold:

(i) $g(X(\tau^*))=V(\tau^*)$, a.s.;

(ii) $R^{u^*}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^{u^*}$-martingale; and

(iii) $R^{u}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^{u}$-submartingale, for every $u(\cdot)\in\mathfrak{U}$.

The present section is devoted to the proof of this result. We shall derive first the conditions (i)-(iii) from the properties (10) of the saddle; then the reverse.



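Proof of necessity. Let us assume that the pair $(u^*,\tau^*)\in\mathfrak{U}\times\mathcal{S}$ is a saddle point for the game, that is, that the properties of (10) are satisfied.

• Using the definition of $\varrho_{\tau^*}$, the submartingale property $\mathbb{E}^u[R^u(\varrho_t)\mid\mathcal{F}_t]\ge R^u(t)$ from Proposition 5.3, the a.s. comparisons $Y^u(\tau^*)\le R^u(\tau^*)$ and $Y^u(\varrho_{\tau^*})=R^u(\varrho_{\tau^*})$, and the first property of the saddle in (10), we obtain

$$
\mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big) \le \mathbb{E}^{u^*}\big(R^{u^*}(\tau^*)\big) \le \mathbb{E}^{u^*}\big(R^{u^*}(\varrho_{\tau^*})\big) = \mathbb{E}^{u^*}\big(Y^{u^*}(\varrho_{\tau^*})\big) \le \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big).
$$

But this gives, in particular, $\mathbb{E}^{u^*}(Y^{u^*}(\tau^*))=\mathbb{E}^{u^*}(R^{u^*}(\tau^*))$, which, coupled with the earlier a.s. comparison, gives the stronger one $Y^{u^*}(\tau^*)=R^{u^*}(\tau^*)$, thus, also $g(X(\tau^*))=V(\tau^*)$.

• Next, consider an arbitrary stopping rule $\tau\in\mathcal{S}$ with $0\le\tau\le\tau^*$ and observe the string of inequalities

$$
\mathbb{E}^{u^*}\big(R^{u^*}(\tau)\big) \le \mathbb{E}^{u^*}\big(R^{u^*}(\varrho_{\tau})\big) = \mathbb{E}^{u^*}\big(Y^{u^*}(\varrho_{\tau})\big) \le \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big) = \mathbb{E}^{u^*}\big(R^{u^*}(\tau^*)\big)
$$

from Proposition 5.3, the definition of $\varrho_\tau$, the first property of the saddle, and property (i) just proved. On the other hand, from the second property of a saddle, from property (i) just proved and from the inequality (55), we get the second string of inequalities

$$
\mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big) = \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(Y^{u}(\tau^*)\big) = \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(R^{u}(\tau^*)\big) \le \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(R^{u}(\tau)\big) \le \mathbb{E}^{u^*}\big(R^{u^*}(\tau)\big).
$$

Combining the two strings, we deduce

$$
\mathbb{E}^{u^*}\big(R^{u^*}(\tau)\big) = \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(R^{u}(\tau)\big) = \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(R^{u}(\tau^*)\big) = \mathbb{E}^{u^*}\big(R^{u^*}(\tau^*)\big) \tag{57}
$$

for every stopping rule $\tau\in\mathcal{S}$ with $0\le\tau\le\tau^*$. This shows that $R^{u^*}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^{u^*}$-martingale [cf. Exercise 1.3.26 in Karatzas and Shreve (1991)], and condition (ii) is established.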

• It remains to show that, for any given $v(\cdot)\in\mathfrak{U}$, the process $R^{v}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^v$-submartingale; equivalently, that for any stopping rules $t,\tau$ with $0\le t\le\tau\le\tau^*$, the inequality

$$
\mathbb{E}^v\Big[V(\tau)+\int_t^{\tau} h(s,X,v_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \ge V(t) \quad\text{holds a.s.} \tag{58}
$$

Let us start by fixing a stopping rule $\tau$ as above, and recalling from (47) of Proposition 6.1 that

$$
V(t;\tau) := \operatorname*{ess\,inf}_{u\in\mathfrak{U}} \mathbb{E}^u\Big[V(\tau)+\int_t^{\tau} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big] \le V(t) \tag{59}
$$

holds a.s. We'll be done, that is, we shall have proved (58), as soon as we have established that the reverse inequality

$$
V(t;\tau) \ge V(t) \quad\text{holds a.s.} \tag{60}
$$

as well, for any given $\tau\in\mathcal{S}$ with $t\le\tau\le\tau^*$.

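To this effect, let us consider for any $\varepsilon>0$ the event $A_\varepsilon$ and the stopping rule $\theta_\varepsilon$ given as

$$
A_\varepsilon := \{V(t)\ge V(t;\tau)+\varepsilon\}\in\mathcal{F}_t \qquad\text{and}\qquad \theta_\varepsilon := t\cdot 1_{A_\varepsilon} + \tau\cdot 1_{A_\varepsilon^c},
$$

respectively, and note $0\le t\le\theta_\varepsilon\le\tau\le\tau^*\le T$. From (57), we get

$$
\begin{aligned}
\mathbb{E}^{u^*}\big(R^{u^*}(t)\big) &= \mathbb{E}^{u^*}\big(R^{u^*}(\theta_\varepsilon)\big) = \mathbb{E}^{u^*}\big[R^{u^*}(t)\cdot 1_{A_\varepsilon} + R^{u^*}(\tau)\cdot 1_{A_\varepsilon^c}\big]\\
&= \mathbb{E}^{u^*}\big[R^{u^*}(t)\cdot 1_{A_\varepsilon} + \mathbb{E}^{u^*}\big(R^{u^*}(\tau)\mid\mathcal{F}_t\big)\cdot 1_{A_\varepsilon^c}\big]\\
&= \mathbb{E}^{u^*}\Big[V(t)\cdot 1_{A_\varepsilon} + \mathbb{E}^{u^*}\Big(V(\tau)+\int_t^{\tau} h(s,X,u^*_s)\,ds \,\Big|\, \mathcal{F}_t\Big)\cdot 1_{A_\varepsilon^c} + \int_0^{t} h(s,X,u^*_s)\,ds\Big]\\
&\ge \mathbb{E}^{u^*}\Big[V(t)\cdot 1_{A_\varepsilon} + V(t;\tau)\cdot 1_{A_\varepsilon^c} + \int_0^{t} h(s,X,u^*_s)\,ds\Big]\\
&\ge \varepsilon\cdot\mathbb{P}^{u^*}(A_\varepsilon) + \mathbb{E}^{u^*}\Big[V(t;\tau) + \int_0^{t} h(s,X,u^*_s)\,ds\Big].
\end{aligned}
$$

That is,

$$
\mathbb{E}^{u^*}\big(R^{u^*}(t)\big) - \varepsilon\cdot\mathbb{P}^{u^*}(A_\varepsilon) \ge \mathbb{E}^{u^*}\Big[V(t;\tau) + \int_0^{t} h(s,X,u^*_s)\,ds\Big]. \tag{61}
$$

As in (48), we write now the random variable $V(t;\tau)$ of (59) in the form

$$
V(t;\tau) = \operatorname*{ess\,inf}_{u\in\mathfrak{U}^*_{[0,t]}} \mathbb{E}^u\Big[V(\tau)+\int_t^{\tau} h(s,X,u_s)\,ds \,\Big|\, \mathcal{F}_t\Big]
= \lim_{k\to\infty} \mathbb{E}^{u^k}\Big[V(\tau)+\int_t^{\tau} h(s,X,u^k_s)\,ds \,\Big|\, \mathcal{F}_t\Big]
$$

for some sequence $\{u^k(\cdot)\}_{k\in\mathbb{N}}$ in $\mathfrak{U}^*_{[0,t]}$, the set of admissible control strategies $u(\cdot)\in\mathfrak{U}$ that agree with $u^*(\cdot)$ a.e. on the stochastic interval $[\![0,t]\!]$. Back into (61), this gives

$$
\begin{aligned}
\mathbb{E}^{u^*}\big(R^{u^*}(t)\big) - \varepsilon\cdot\mathbb{P}^{u^*}(A_\varepsilon)
&\ge \mathbb{E}^{u^*}\Big[\lim_{k} \mathbb{E}^{u^k}\Big(V(\tau)+\int_t^{\tau} h(s,X,u^k_s)\,ds \,\Big|\, \mathcal{F}_t\Big) + \int_0^{t} h(s,X,u^*_s)\,ds\Big]\\
&= \mathbb{E}^{u^*}\Big[\lim_{k} \mathbb{E}^{u^k}\Big(V(\tau)+\int_0^{\tau} h(s,X,u^k_s)\,ds \,\Big|\, \mathcal{F}_t\Big)\Big]\\
&= \mathbb{E}^{u^*}\Big[\lim_{k} \mathbb{E}^{u^k}\big(R^{u^k}(\tau)\mid\mathcal{F}_t\big)\Big]\\
&= \lim_{k} \mathbb{E}^{u^*}\big[\mathbb{E}^{u^k}\big(R^{u^k}(\tau)\mid\mathcal{F}_t\big)\big] && \text{(bounded convergence)}\\
&= \lim_{k} \mathbb{E}^{u^k}\big[\mathbb{E}^{u^k}\big(R^{u^k}(\tau)\mid\mathcal{F}_t\big)\big] && \text{[equation (24), Lemma 4.3]}\\
&= \lim_{k} \mathbb{E}^{u^k}\big(R^{u^k}(\tau)\big) \ge \inf_{u\in\mathfrak{U}} \mathbb{E}^{u}\big(R^{u}(\tau)\big) = \mathbb{E}^{u^*}\big(R^{u^*}(\tau)\big) = \mathbb{E}^{u^*}\big(R^{u^*}(t)\big).
\end{aligned}
$$

The last claim follows from (57), the martingale property of (ii) that this implies, and $0\le t\le\tau\le\tau^*$. This shows $\mathbb{P}(A_\varepsilon)=0$, and we get $V(t)<V(t;\tau)+\varepsilon$ a.s., for every $\varepsilon>0$; letting $\varepsilon\downarrow 0$, we arrive at (60), and we are done. □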

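Proof of sufficiency. Let us suppose now that the pair $(u^*,\tau^*)\in\mathfrak{U}\times\mathcal{S}$ satisfies the properties (i)-(iii) of Theorem 7.1; we shall deduce from them the properties of (10) for a saddle-point.

The $\mathbb{P}^u$-submartingale property of $R^u(\cdot\wedge\tau^*)$ in property (iii) gives $\mathbb{E}^u(R^u(\tau))\le\mathbb{E}^u(R^u(\tau^*))$ for all $u(\cdot)\in\mathfrak{U}$ and all stopping rules $0\le\tau\le\tau^*$, thus, also

$$
\inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\tau)\big) \le \inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\tau^*)\big).
$$

Taking here $\tau=0$ and using the property (i) for $\tau^*$, as well as the $\mathbb{P}^{u^*}$-martingale property of $R^{u^*}(\cdot\wedge\tau^*)$ from (ii), we get

$$
\inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(Y^u(\tau^*)\big) = \inf_{u\in\mathfrak{U}} \mathbb{E}^u\big(R^u(\tau^*)\big) \ge R^u(0) = V = R^{u^*}(0) = \mathbb{E}^{u^*}\big(R^{u^*}(\tau^*)\big) = \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big).
$$

Comparing the two extreme terms in this string, we obtain the second property of the saddle.

• We continue by considering stopping rules $\tau\in\mathcal{S}$ with $0\le\tau\le\tau^*$. For such stopping rules, the fact that $R^{u^*}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^{u^*}$-martingale [property (ii)] leads to

$$
Y^{u^*}(\tau) \le R^{u^*}(\tau) = \mathbb{E}^{u^*}\big(R^{u^*}(\tau^*)\mid\mathcal{F}_\tau\big) = \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\mid\mathcal{F}_\tau\big) \quad\text{a.s.} \tag{62}
$$

and this gives the first property of the saddle for such stopping rules, upon taking expectations.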




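• Let us consider now stopping rules $\tau\in\mathcal{S}$ with $\tau^*\le\tau\le T$. We shall establish for them the first property of the saddle, actually in the stronger form

$$
\mathbb{E}^{u^*}\big(Y^{u^*}(\tau)\mid\mathcal{F}_{\tau^*}\big) \le Y^{u^*}(\tau^*) \quad\text{a.s.} \tag{63}
$$

Now (63) is equivalent to

$$
g(X(\tau^*)) \ge \mathbb{E}^{u^*}\Big(g(X(\tau))+\int_{\tau^*}^{\tau} h(t,X,u^*_t)\,dt \,\Big|\, \mathcal{F}_{\tau^*}\Big) = \mathbb{E}^{u^*}\big[Y^{u^*}(\tau^*,\tau)\mid\mathcal{F}_{\tau^*}\big],
$$

a.s., for every $\tau\in\mathcal{S}_{\tau^*,T}$, thus to $g(X(\tau^*))\ge Z^{u^*}(\tau^*)$, a.s. But from (19) and (21) the reverse of this inequality always holds, so (63) amounts to the requirement

$$
g(X(\tau^*)) = Z^{u^*}(\tau^*) \quad\text{a.s.} \tag{64}
$$

To prove (64), recall from condition (ii) that $R^{u^*}(\cdot\wedge\tau^*)$ is a $\mathbb{P}^{u^*}$-martingale, and from (36) that it dominates $Y^{u^*}(\cdot\wedge\tau^*)$. But from Proposition 4.1, the process $Q^{u^*}(\cdot\wedge\tau^*)$ is the smallest $\mathbb{P}^{u^*}$-supermartingale that dominates $Y^{u^*}(\cdot\wedge\tau^*)$. Consequently, $R^{u^*}(\cdot\wedge\tau^*)\ge Q^{u^*}(\cdot\wedge\tau^*)$ and, equivalently, $V(\cdot\wedge\tau^*)\ge Z^{u^*}(\cdot\wedge\tau^*)$, hold a.s. But the reverse inequality also holds, thanks to the expression (26) for $V(\cdot)$, thus, in fact, $V(\cdot\wedge\tau^*)=Z^{u^*}(\cdot\wedge\tau^*)$, a.s. In particular, we get $V(\tau^*)=Z^{u^*}(\tau^*)$ a.s. Now (64) follows, in conjunction with condition (i).

• Finally, let us prove the first property of the saddle for an arbitrary stopping rule $\tau\in\mathcal{S}$. We start with the decomposition

$$
\begin{aligned}
\mathbb{E}^{u^*}\big(Y^{u^*}(\tau)\big) &= \mathbb{E}^{u^*}\big(Y^{u^*}(\tau)1_{\{\tau\le\tau^*\}} + Y^{u^*}(\tau)1_{\{\tau>\tau^*\}}\big)\\
&= \mathbb{E}^{u^*}\big(Y^{u^*}(\rho)1_{\{\tau\le\tau^*\}} + Y^{u^*}(\nu)1_{\{\tau>\tau^*\}}\big),
\end{aligned}
$$

where $\rho:=\tau\wedge\tau^*$ belongs to $\mathcal{S}_{0,\tau^*}$ and $\nu:=\tau\vee\tau^*$ is in $\mathcal{S}_{\tau^*,T}$. Thus, we have almost surely

$$
Y^{u^*}(\rho) \le \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\mid\mathcal{F}_\rho\big) \qquad\text{and}\qquad \mathbb{E}^{u^*}\big(Y^{u^*}(\nu)\mid\mathcal{F}_{\tau^*}\big) \le Y^{u^*}(\tau^*),
$$

from (62) and (63). Both events $\{\tau\le\tau^*\}$, $\{\tau>\tau^*\}$ belong to $\mathcal{F}_\rho=\mathcal{F}_\tau\cap\mathcal{F}_{\tau^*}$, therefore,

$$
\begin{aligned}
\mathbb{E}^{u^*}\big(Y^{u^*}(\tau)\big) &= \mathbb{E}^{u^*}\big(Y^{u^*}(\rho)\cdot 1_{\{\tau\le\tau^*\}} + Y^{u^*}(\nu)\cdot 1_{\{\tau>\tau^*\}}\big)\\
&\le \mathbb{E}^{u^*}\big(\mathbb{E}^{u^*}(Y^{u^*}(\tau^*)\mid\mathcal{F}_\rho)\cdot 1_{\{\tau\le\tau^*\}} + \mathbb{E}^{u^*}(Y^{u^*}(\nu)\mid\mathcal{F}_\rho)\cdot 1_{\{\tau>\tau^*\}}\big)\\
&= \mathbb{E}^{u^*}\big(\mathbb{E}^{u^*}(Y^{u^*}(\tau^*)\cdot 1_{\{\tau\le\tau^*\}}\mid\mathcal{F}_\rho) + \mathbb{E}^{u^*}(Y^{u^*}(\nu)\mid\mathcal{F}_{\tau^*})\cdot 1_{\{\tau>\tau^*\}}\big)\\
&\le \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\cdot 1_{\{\tau\le\tau^*\}}\big) + \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\cdot 1_{\{\tau>\tau^*\}}\big) = \mathbb{E}^{u^*}\big(Y^{u^*}(\tau^*)\big).
\end{aligned}
$$

This is the first property of the saddle in (10), established now for arbitrary $\tau\in\mathcal{S}$. □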
8. Optimality conditions for control. We shall say that a given admissible control strategy $u(\cdot)\in\mathfrak{U}$ is optimal, if it attains the infimum

$$
V = \inf_{v\in\mathfrak{U}} Z^v(0), \qquad\text{with}\quad Z^v(0) = \sup_{\tau\in\mathcal{S}} \mathbb{E}^v\big[Y^v(\tau)\big]. \tag{65}
$$

Here and in what follows, we are using the notation of (19), (22) and (33). Clearly, if $(u,\tau)$ is a saddle pair for the stochastic game, then $u(\cdot)$ is an optimal control strategy.



Theorem 8.1 (Necessary and sufficient condition for optimality of control). A given admissible control strategy $u(\cdot)\in\mathfrak{U}$ is optimal, that is, attains the infimum in (65), if and only if it is thrifty, that is, satisfies

$$
R^u(\cdot\wedge\tau^u_0) \ \text{ is a } \mathbb{P}^u\text{-martingale.} \tag{66}
$$

And in this case, for every $0\le\varepsilon<1$, we have in the notation of (33)

$$
\tau^u_0(\varepsilon) = \varrho_0(\varepsilon) \quad\text{a.s.} \tag{67}
$$

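Proof of sufficiency. Let us recall from (35) that $\tau^u_0(\varepsilon)\le\tau^u_0$ holds a.s. for every $0<\varepsilon<1$, and from Proposition 4.2 that the process $Q^u(\cdot\wedge\tau^u_0)$ is a $\mathbb{P}^u$-martingale. Therefore, if $u(\cdot)$ is thrifty, we have

$$
\begin{aligned}
V \le Z^u(0) &= \mathbb{E}^u\Big[Z^u(\tau^u_0(\varepsilon)) + \int_0^{\tau^u_0(\varepsilon)} h(s,X,u_s)\,ds\Big]\\
&\le \mathbb{E}^u\Big[\varepsilon + g\big(X(\tau^u_0(\varepsilon))\big) + \int_0^{\tau^u_0(\varepsilon)} h(s,X,u_s)\,ds\Big]\\
&\le \varepsilon + \mathbb{E}^u\Big[V(\tau^u_0(\varepsilon)) + \int_0^{\tau^u_0(\varepsilon)} h(s,X,u_s)\,ds\Big]\\
&= \varepsilon + \mathbb{E}^u\big[R^u(\tau^u_0(\varepsilon))\big] = \varepsilon + V.
\end{aligned}
$$

In this string the second inequality comes from the definition of $\tau^u_0(\varepsilon)$ in (22); whereas the last equality is a consequence of thriftiness and of the inequality $\tau^u_0(\varepsilon)\le\tau^u_0$. This gives the comparison $V\le Z^u(0)\le\varepsilon+V$ for every $0<\varepsilon<1$, therefore, $Z^u(0)=V$, the optimality of $u(\cdot)$. □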



Proof of necessity. Let us suppose now that $u(\cdot)\in\mathfrak{U}$ is optimal; we shall show that it is thrifty, and that (67) holds for every $0\le\varepsilon<1$.

• We shall show first that, for this optimal $u(\cdot)$, we have $\tau^u_0=\varrho_0$ a.s.; that is, (67) with $\varepsilon=0$.

Let us observe that the $\mathbb{P}^u$-martingale property of $Q^u(\cdot\wedge\tau^u_0)$, coupled with the $\mathbb{P}^u$-submartingale property of $R^u(\cdot\wedge\varrho_0)$ from Proposition 5.3, and the a.s. inequality $\varrho_0\le\tau^u_0$ from (35) give

$$
\begin{aligned}
Z^u(0) - \mathbb{E}^u\int_0^{\varrho_0} h(s,X,u_s)\,ds &= \mathbb{E}^u\big(Z^u(\varrho_0)\big)\\
&= \mathbb{E}^u\big[Z^u(\varrho_0)\cdot 1_{\{\tau^u_0=\varrho_0\}} + Z^u(\varrho_0)\cdot 1_{\{\tau^u_0>\varrho_0\}}\big]\\
&\ge \mathbb{E}^u\big[Z^u(\varrho_0)\cdot 1_{\{\tau^u_0=\varrho_0\}} + g(X(\varrho_0))\cdot 1_{\{\tau^u_0>\varrho_0\}}\big]\\
&= \mathbb{E}^u\big[Z^u(\varrho_0)\cdot 1_{\{\tau^u_0=\varrho_0\}} + V(\varrho_0)\cdot 1_{\{\tau^u_0>\varrho_0\}}\big]\\
&\ge \mathbb{E}^u\big[V(\varrho_0)\cdot 1_{\{\tau^u_0=\varrho_0\}} + V(\varrho_0)\cdot 1_{\{\tau^u_0>\varrho_0\}}\big]\\
&= \mathbb{E}^u\big[V(\varrho_0)\big],
\end{aligned} \tag{68}
$$

as well as

$$
Z^u(0) \ge \mathbb{E}^u\Big[V(\varrho_0) + \int_0^{\varrho_0} h(s,X,u_s)\,ds\Big] = \mathbb{E}^u\big[R^u(\varrho_0)\big] \ge R^u(0) = V. \tag{69}
$$

We shall argue the validity of $\tau^u_0=\varrho_0$ by contradiction: we know from (35) that $\varrho_0\le\tau^u_0$ holds a.s., so let us assume

$$
\mathbb{P}(\tau^u_0>\varrho_0) > 0. \tag{70}
$$

Under the assumption (70), the first inequality in (68), thus also in (69), is strict; but this contradicts the optimality of $u(\cdot)\in\mathfrak{U}$. Thus, as claimed, we have $\tau^u_0=\varrho_0$ a.s. A similar argument leads to $\tau^u_0(\varepsilon)=\varrho_0(\varepsilon)$ a.s., for every $0<\varepsilon<1$, and (67) is proved.

• To see that this optimal $u(\cdot)\in\mathfrak{U}$ must also be thrifty, just observe that, as we have seen, equality prevails in (69); and that this, coupled with (67), gives $R^u(0)=\mathbb{E}^u[R^u(\tau^u_0)]$. It follows that the $\mathbb{P}^u$-submartingale $R^u(\cdot\wedge\varrho_0)=R^u(\cdot\wedge\tau^u_0)$ is in fact a $\mathbb{P}^u$-martingale. □

The characterization of optimality presented in Theorem 8.1 is in the spirit of a similar characterization for optimal control with discretionary stopping in Dubins and Savage (1965) and in Maitra and Sudderth [(1996a), page 75]. In the context of these two sources, optimality amounts to the simultaneous validity of two conditions, "thriftiness" [i.e., condition (66)] and "equalization." In our context every control strategy is equalizing, so this latter condition becomes moot.



Proposition 8.2. If the admissible control strategy $u(\cdot)\in\mathfrak{U}$ is thrifty, then it is optimal; and the pair $(u,\tau^u_0)=(u,\varrho_0)\in\mathfrak{U}\times\mathcal{S}$ is then a saddle point for the stochastic game of control and stopping.






Proof. The first claim follows directly from Theorem 8.1. Now let us make a few observations:

(i) By the definition of $\varrho_0$ in (33) and the right-continuity of the process $V(\cdot)$, we have the a.s. equality $V(\varrho_0)=g(X(\varrho_0))$.

(ii) The process $R^u(\cdot\wedge\varrho_0)$ is a $\mathbb{P}^u$-martingale; this is because $u(\cdot)$, being optimal, must also be thrifty, as we saw in Theorem 8.1, and because $\varrho_0=\tau^u_0$ holds a.s.

(iii) From Proposition 5.3, the process $R^v(\cdot\wedge\varrho_0)$ is a $\mathbb{P}^v$-submartingale, for every $v(\cdot)\in\mathfrak{U}$.

From these observations and Theorem 7.1, it is now clear that the pair $(u,\varrho_0)$ is a saddle point of the stochastic game. □

9. Constructing a thrifty control strategy and a saddle. The theory of the previous section, culminating with Proposition 8.2, shows that in order to construct a saddle point for our stochastic game of control and stopping, all we need to do is find an admissible control strategy $u^*(\cdot)\in\mathfrak{U}$ which is thrifty; to wit, for which the condition (66) holds. Then the pair $(u^*,\tau^{u^*}_0)$ will be a saddle point for our stochastic game.

To accomplish this, we shall start by assuming that, for each $(t,\omega)$, the mappings

$$
a\mapsto f(t,\omega,a) \quad\text{and}\quad a\mapsto h(t,\omega,a) \quad\text{are continuous,} \tag{71}
$$

and that for the so-called Hamiltonian function

$$
H(t,\omega,a,p) := \big\langle p,\ \sigma^{-1}(t,\omega)f(t,\omega,a)\big\rangle + h(t,\omega,a), \tag{72}
$$

$t\in[0,T]$, $\omega\in\Omega$, $a\in A$, $p\in\mathbb{R}^n$, the mapping $a\mapsto H(t,\omega,a,p)$ attains its infimum over the set $A$ at some $a^*\equiv a^*(t,\omega,p)\in A$, for any given $(t,\omega,p)\in[0,T]\times\Omega\times\mathbb{R}^n$, namely,

$$
\inf_{a\in A} H(t,\omega,a,p) = H\big(t,\omega,a^*(t,\omega,p),p\big). \tag{73}
$$

[This is the case, for instance, if the set $A$ is compact and the mapping $a\mapsto H(t,\omega,a,p)$ continuous.] Then it can be shown [see Lemma 1 in Benes (1970), or Lemma 16.34 in Elliott (1982)] that the mapping $a^*:([0,T]\times\Omega)\times\mathbb{R}^n\to A$ can be selected to be $(\mathcal{P}\otimes\mathcal{B}(\mathbb{R}^n)/\mathcal{A})$-measurable.

We shall deploy the martingale methodologies introduced in stochastic control in the seminal papers of Rishel (1970), Duncan and Varaiya (1971), Davis and Varaiya (1973) and Davis (1973), and presented in book form in Chapter 16 of Elliott (1982). The starting point of this approach is the observation that, for every admissible control strategy $u(\cdot)\in\mathfrak{U}$, the process

$$
R^u(t\wedge\varrho_0) = V(t\wedge\varrho_0) + \int_0^{t\wedge\varrho_0} h(s,X,u_s)\,ds, \qquad 0\le t\le T, \tag{74}
$$

is a $\mathbb{P}^u$-submartingale with RCLL paths, and bounded uniformly on $[0,T]\times\Omega$; recall Propositions 5.3, 5.2 and Remark 5.5. This implies that the process $R^u(\cdot\wedge\varrho_0)$ admits a Doob-Meyer decomposition

$$
R^u(\cdot\wedge\varrho_0) = V + M^u(\cdot) + A^u(\cdot). \tag{75}
$$

Here $M^u(\cdot)$ is a uniformly integrable $\mathbb{P}^u$-martingale with RCLL paths and $M^u(0)=0$, $M^u(\cdot)=M^u(\varrho_0)$ on $[\![\varrho_0,T]\!]$; the process $A^u(\cdot)$ is predictable, with nondecreasing paths, $A^u(T)=A^u(\varrho_0)$ integrable, and $A^u(0)=0$.
• A key observation now is that the $\mathbb{P}^u$-martingale $M^u(\cdot)$ can be represented as a stochastic integral, in the form

$$
M^u(\cdot) = \int_0^{\cdot} \big\langle\gamma(t),\,dW^u(t)\big\rangle. \tag{76}
$$

Here $W^u(\cdot)$ is the $\mathbb{P}^u$-Brownian motion of (4), and $\gamma(\cdot)$ a predictable ($\mathcal{P}$-measurable) process that satisfies $\int_0^T\|\gamma(t)\|^2\,dt<\infty$ and $\gamma(\cdot)=0$ on $[\![\varrho_0,T]\!]$, a.s.

This is, of course, the predictable representation property of the filtration $\mathbf{F}=\{\mathcal{F}_t\}_{0\le t\le T}$ [the augmentation of the filtration $\sigma(W(s);\,0\le s\le t)$, $0\le t\le T$, generated by the $\mathbb{P}$-Brownian motion $W(\cdot)$] under the equivalent change (5) of probability measure. For this result of Fujisaki et al. (1972), which is very useful in filtering theory, see Rogers and Williams (1987), page 323 or Karatzas and Shreve (1998), Lemma 1.6.7. An important aspect of this representation is that the same process $\gamma(\cdot)$ works for every $u(\cdot)\in\mathfrak{U}$ in (76).

Next, let us take any two admissible control strategies $u(\cdot)$ and $v(\cdot)$ in $\mathfrak{U}$, and compare the resulting decompositions (75) on the stochastic interval $[\![0,\varrho_0]\!]$. In conjunction with (74)-(76), (72) and (4), this gives

$$
A^v(\cdot) - A^u(\cdot) = \int_0^{\cdot} \big[H(t,X,v_t,\gamma(t)) - H(t,X,u_t,\gamma(t))\big]\,dt \tag{77}
$$

on the interval $[\![0,\varrho_0]\!]$. A brief, self-contained argument for the claims (76) and (77) is presented in the Appendix.

Analysis. If we know that $u(\cdot)\in\mathfrak{U}$ is a thrifty control strategy, that is, the process $R^u(\cdot\wedge\tau^u_0)$ is a $\mathbb{P}^u$-martingale, then $R^u(\cdot\wedge\varrho_0)$ is also a $\mathbb{P}^u$-martingale [just recall that we have $0\le\varrho_0\le\tau^u_0$ from (35)], thus $A^u(\cdot)\equiv 0$ a.s. But then (77) gives

$$
A^v(\cdot) = \int_0^{\cdot} \big[H(t,X,v_t,\gamma(t)) - H(t,X,u_t,\gamma(t))\big]\,dt \quad\text{on } [\![0,\varrho_0]\!];
$$

and because this process has to be nondecreasing for every admissible control strategy $v(\cdot)\in\mathfrak{U}$, we deduce the following necessary condition for thriftiness:

$$
H(t,X,u_t,\gamma(t)) = \inf_{a\in A} H(t,X,a,\gamma(t)) \quad\text{a.e. on } [\![0,\varrho_0]\!]. \tag{78}
$$

This is also known as the stochastic version of Pontryagin's Maximum Principle; cf. Kushner (1965), Haussmann (1986) and Peng (1990, 1993).
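The pointwise minimization in (73) and (78) can be illustrated on a one-dimensional toy Hamiltonian. The sketch below assumes, purely for illustration, scalar data $f(a)=a$, $h(a)=a^2/2$, a constant $\sigma$, and a control set $A=[-1,1]$ discretized on a grid; none of these choices comes from the paper, and the names `hamiltonian`, `a_star` and `GRID` are hypothetical. For these data the minimizer has the closed form $a^*(p)=\max(-1,\min(1,-p/\sigma))$, against which the grid search can be compared.

```python
from typing import Sequence


def hamiltonian(p: float, a: float, sigma: float = 1.0) -> float:
    """Toy scalar Hamiltonian H(a, p) = p * f(a) / sigma + h(a),
    with the assumed data f(a) = a and h(a) = a**2 / 2."""
    return p * (a / sigma) + 0.5 * a * a


def a_star(p: float, grid: Sequence[float], sigma: float = 1.0) -> float:
    """Grid version of the selector in (73): an action minimizing H over the grid."""
    return min(grid, key=lambda a: hamiltonian(p, a, sigma))


# Discretized control set A = [-1, 1] (assumption: a compact interval).
GRID = [k / 100.0 for k in range(-100, 101)]

if __name__ == "__main__":
    for p in (-5.0, -0.3, 0.0, 0.3, 5.0):
        print(p, a_star(p, GRID))
```

For small $|p|$ the minimizer is interior and tracks $-p$; for large $|p|$ it sits on the boundary of the control set, which is where compactness of $A$ is used to guarantee that the infimum is attained.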

Synthesis. The stochastic maximum principle of (78) suggests considering the admissible control strategy $u^*(\cdot)\in\mathfrak{U}$ defined by

$$
u^*_t = \begin{cases} a^*(t,X,\gamma(t)), & 0\le t\le\varrho_0,\\ a_\sharp, & \varrho_0<t\le T, \end{cases} \tag{79}
$$

for an arbitrary but fixed element $a_\sharp$ of the control set $A$. We are using here the "measurable selector" mapping $a^*:[0,T]\times\Omega\times\mathbb{R}^n\to A$ of (73).

With this choice, (77) leads to the comparison

$$
A^v(\cdot) = A^{u^*}(\cdot) + \int_0^{\cdot} \big[H(t,X,v_t,\gamma(t)) - H(t,X,u^*_t,\gamma(t))\big]\,dt \ \ge\ A^{u^*}(\cdot)
$$

on the interval $[\![0,\varrho_0]\!]$, therefore, also $R^v(\cdot\wedge\varrho_0)\ge V+M^v(\cdot)+A^{u^*}(\cdot)$ from (75), for every $v(\cdot)\in\mathfrak{U}$. Taking expectations under $\mathbb{P}^v$ we obtain

$$
0 \le \mathbb{E}^v\big[A^{u^*}(\varrho_0)\big] \le \mathbb{E}^v\big[R^v(\varrho_0)\big] - V \qquad \forall\, v(\cdot)\in\mathfrak{U}.
$$

But now we can take the infimum over $v(\cdot)\in\mathfrak{U}$ in the above string, and obtain

$$
0 \le \inf_{v\in\mathfrak{U}} \mathbb{E}^v\big[A^{u^*}(\varrho_0)\big] \le \inf_{v\in\mathfrak{U}} \mathbb{E}^v\big[R^v(\varrho_0)\big] - V = 0,
$$

where the last equality comes from (55) and the sentence directly below it. We deduce

$$
\inf_{v\in\mathfrak{U}} \mathbb{E}^v\big[A^{u^*}(\varrho_0)\big] = 0, \qquad\text{thus also}\quad A^{u^*}(\varrho_0)=0 \ \text{ a.s.} \tag{80}
$$

from fairly standard weak compactness arguments, as in Davis (1973) page 592, Davis (1979) or Elliott (1982) pages 238-240.

• We follow now a reasoning similar to that used to prove (64) in Theorem 7.1: first, we note from (80) that

$$
R^{u^*}(\cdot\wedge\varrho_0) = V + \int_0^{\cdot} \big\langle\gamma(t),\,dW^{u^*}(t)\big\rangle \quad\text{is a } \mathbb{P}^{u^*}\text{-martingale,} \tag{81}
$$

and from (36) that it dominates $Y^{u^*}(\cdot\wedge\varrho_0)$. But from Proposition 4.1, the process $Q^{u^*}(\cdot\wedge\varrho_0)$ is the smallest $\mathbb{P}^{u^*}$-supermartingale that dominates $Y^{u^*}(\cdot\wedge\varrho_0)$. We deduce that $R^{u^*}(\cdot\wedge\varrho_0)\ge Q^{u^*}(\cdot\wedge\varrho_0)$ and, equivalently, $V(\cdot\wedge\varrho_0)\ge Z^{u^*}(\cdot\wedge\varrho_0)$, hold a.s. The reverse of this inequality also holds, thanks to the expression (26) for $V(\cdot)$, thus, in fact, $V(\cdot\wedge\varrho_0)=Z^{u^*}(\cdot\wedge\varrho_0)$.

In particular, we have almost surely $Z^{u^*}(\varrho_0)=V(\varrho_0)=g(X(\varrho_0))$ (recall the definition of $\varrho_0$), thus, also $\tau^{u^*}_0\le\varrho_0$ from (22). Again, the reverse inequality holds, thanks now to (35), so, in fact, $\tau^{u^*}_0=\varrho_0$ holds a.s.
We conclude that the property (81) leads to the thriftiness condition (66) for the admissible control strategy $u^*(\cdot)\in\mathfrak{U}$ defined in (79). In conjunction with Proposition 8.2, this establishes the following existence and characterization result:

Theorem 9.1. Under the assumptions (71)-(73) of this section, the pair $(u^*,\varrho_0)\in\mathfrak{U}\times\mathcal{S}$ of (79) and (33) is a saddle point for the stochastic game, and we have $\varrho_0=\tau^{u^*}_0$ a.s., in the notation of (22). Furthermore, the process $V(\cdot\wedge\varrho_0)$ is a continuous $\mathbf{F}$-semimartingale.

Only the last claim needs discussion; from (74), (81) and (72), we get the representation

$$
V(t) = V - \int_0^{t} H\big(s,X,u^*_s,\gamma(s)\big)\,ds + \int_0^{t} \big\langle\gamma(s),\,dW(s)\big\rangle \tag{82}
$$

for $0\le t\le\varrho_0$, and the claim follows.
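The passage from (81) to (82) can be spelled out as follows. The sketch assumes that the $\mathbb{P}^{u}$-Brownian motion of (4) has the form $dW^{u}(s)=dW(s)-\sigma^{-1}(s,X)f(s,X,u_s)\,ds$, which is how it enters the computations of the Appendix; equation (4) itself is not restated in this part of the paper.

```latex
\begin{align*}
V(t) &= R^{u^*}(t)-\int_0^t h(s,X,u^*_s)\,ds
      && \text{by (74), for } 0\le t\le\varrho_0\\
     &= V+\int_0^t\big\langle\gamma(s),dW^{u^*}(s)\big\rangle-\int_0^t h(s,X,u^*_s)\,ds
      && \text{by (81)}\\
     &= V+\int_0^t\big\langle\gamma(s),dW(s)\big\rangle
        -\int_0^t\Big[\big\langle\gamma(s),\sigma^{-1}(s,X)f(s,X,u^*_s)\big\rangle+h(s,X,u^*_s)\Big]ds\\
     &= V-\int_0^t H\big(s,X,u^*_s,\gamma(s)\big)\,ds+\int_0^t\big\langle\gamma(s),dW(s)\big\rangle
      && \text{by (72)}.
\end{align*}
```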

This equation (82) can be written equivalently "backward," as

$$
V(t) = g(X(\varrho_0)) + \int_t^{\varrho_0} H\big(s,X,u^*_s,\gamma(s)\big)\,ds - \int_t^{\varrho_0} \big\langle\gamma(s),\,dW(s)\big\rangle \tag{83}
$$

for $0\le t\le\varrho_0$. Suitably modified to account for the constraint $V(\cdot)\ge g(X(\cdot))$, and with an appropriate definition for the "adjoint process" $\gamma(\cdot)$ on $[\![\varrho_0,T]\!]$, the equation (83) can be extended to hold on $[\![0,T]\!]$; this brings us into contact with the backward stochastic differential equation approach to stochastic games [Cvitanic and Karatzas (1996), Hamadene and Lepeltier (1995, 2000), Hamadene (2006)].

APPENDIX 

In order to make this paper as self-contained as possible, we shall present here a brief argument for the representation (76) of the $\mathbb{P}^u$-martingale $M^u(\cdot)$ in the Doob-Meyer decomposition (75), and for the associated identity (77).

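We start with the "Bayes rule" computation

$$
M^u(t) = \mathbb{E}^u[M^u(T)\mid\mathcal{F}_t] = \mathbb{E}^u[M^u(\varrho_0)\mid\mathcal{F}_t] = \frac{\mathbb{E}\big[\Lambda^u(\varrho_0)M^u(\varrho_0)\mid\mathcal{F}_t\big]}{\Lambda^u(t\wedge\varrho_0)} \tag{84}
$$

for $0\le t\le T$ [e.g. Karatzas and Shreve (1991), page 193]; then the martingale representation property of the Brownian filtration (ibid., page 182) shows that the numerator of (84) can be expressed as the stochastic integral

$$
N^u(t) := \mathbb{E}\big[\Lambda^u(\varrho_0)M^u(\varrho_0)\mid\mathcal{F}_t\big] = \int_0^{t} \big\langle\xi^u(s),\,dW(s)\big\rangle, \qquad 0\le t\le T, \tag{85}
$$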


with respect to $W(\cdot)$, of some predictable process $\xi^u:[0,T]\times\Omega\to\mathbb{R}^n$ that satisfies $\xi^u(\cdot)=0$ a.e. on $[\![\varrho_0,T]\!]$, and $\int_0^T\|\xi^u(t)\|^2\,dt<\infty$ a.s. We have recalled in (84) and (85) that $M^u(\cdot)=M^u(\varrho_0)$ a.e. on $[\![\varrho_0,T]\!]$, and $N^u(0)=M^u(0)\Lambda^u(0)=0$.

On the other hand, for the exponential martingale of (3), we have the stochastic integral equation

$$
\Lambda^u(t\wedge\varrho_0) = 1 + \int_0^{t} \Lambda^u(s)\big\langle\varphi^u(s),\,dW(s)\big\rangle, \qquad 0\le t\le T, \tag{86}
$$

where we have set $\varphi^u(t):=\sigma^{-1}(t,X)f(t,X,u_t)$ for $0\le t\le\varrho_0$, and $\varphi^u(t):=0$ for $\varrho_0<t\le T$. Applying Itô's rule to the ratio $M^u(\cdot)=N^u(\cdot)/\Lambda^u(\cdot\wedge\varrho_0)$ of (84), in conjunction with (85), (86) and (4), we obtain then, for $0\le t\le T$,

$$
M^u(t) = \int_0^{t} \big\langle\gamma^u(s),\,dW^u(s)\big\rangle, \qquad\text{where}\quad \gamma^u(t) := \frac{\xi^u(t) - N^u(t)\varphi^u(t)}{\Lambda^u(t\wedge\varrho_0)} \tag{87}
$$

is clearly predictable; it satisfies $\gamma^u(\cdot)=0$ a.e. on $[\![\varrho_0,T]\!]$, as well as $\int_0^T\|\gamma^u(t)\|^2\,dt<\infty$ a.s.
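The application of Itô's rule behind (87) can be written out in differential form. The sketch below abbreviates $N=N^u$, $\Lambda=\Lambda^u(\cdot\wedge\varrho_0)$, $\xi=\xi^u$, $\varphi=\varphi^u$, and assumes that the Brownian motion of (4) satisfies $dW^u=dW-\varphi^u\,dt$, consistently with the way $W^u$ and $W^v$ are compared in the remainder of this Appendix.

```latex
\begin{align*}
dN &= \langle\xi,dW\rangle, \qquad d\Lambda=\Lambda\,\langle\varphi,dW\rangle,\\
d\big(\Lambda^{-1}\big) &= -\Lambda^{-1}\langle\varphi,dW\rangle+\Lambda^{-1}\|\varphi\|^2\,dt,\\
d\big(N\Lambda^{-1}\big) &= \Lambda^{-1}\,dN+N\,d\big(\Lambda^{-1}\big)+d\big\langle N,\Lambda^{-1}\big\rangle\\
 &= \Lambda^{-1}\langle\xi,dW\rangle-N\Lambda^{-1}\langle\varphi,dW\rangle
    +N\Lambda^{-1}\|\varphi\|^2\,dt-\Lambda^{-1}\langle\xi,\varphi\rangle\,dt\\
 &= \Big\langle\frac{\xi-N\varphi}{\Lambda},\;dW-\varphi\,dt\Big\rangle
  = \big\langle\gamma^u,dW^u\big\rangle .
\end{align*}
```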

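• It remains to argue that the stochastic integrand of (87) does not depend on the admissible control process $u(\cdot)\in\mathfrak{U}$, as claimed in (76). Indeed, for arbitrary $u(\cdot)\in\mathfrak{U}$ and $v(\cdot)\in\mathfrak{U}$, we have

$$
\begin{aligned}
R^v(t\wedge\varrho_0) - \int_0^{t\wedge\varrho_0} \big[h(s,X,v_s) - h(s,X,u_s)\big]\,ds
&= R^u(t\wedge\varrho_0) = V + A^u(t) + \int_0^{t} \big\langle\gamma^u(s),\,dW^u(s)\big\rangle\\
&= V + A^u(t) + \int_0^{t} \big\langle\gamma^u(s),\,dW^v(s)\big\rangle\\
&\quad + \int_0^{t} \big\langle\gamma^u(s),\varphi^v(s)\big\rangle\,ds - \int_0^{t} \big\langle\gamma^u(s),\varphi^u(s)\big\rangle\,ds, \qquad 0\le t\le T.
\end{aligned}
$$

Let us compare now this decomposition with the consequence

$$
R^v(t\wedge\varrho_0) = V + A^v(t) + \int_0^{t} \big\langle\gamma^v(s),\,dW^v(s)\big\rangle, \qquad 0\le t\le T,
$$

of (75) and (87). Identifying martingale terms, we see that $\gamma^u(\cdot)=\gamma^v(\cdot)$ holds a.e. on $[0,T]\times\Omega$, thus, (76) holds; identifying terms of bounded variation, we arrive at (77).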